PREFACE


This report documents the results of the DAIS study contract. The
results of three areas of work are detailed in the following
pages.

First,  the  results  and  ensuing  recommendations  of  an  instruction  set 
analysis  are  presented  in  Section  2.  Paramount  in  this  work  is  the 
selection  of  base  addressing  as  the  most  effective  method  of  achieving 
greater  software  efficiency.  Indeed,  base  addressing  yields  a 30  percent 
improvement  in  software  efficiency  when  compared  with  the  current  AYK-15 
instruction set. Also, new data formats for floating-point number
representation were analyzed, along with integer and fractional
representations for fixed-point numbers.

The conclusions of this software analysis are presented as a
recommended instruction matrix in Table 2. This instruction set is then
"subsetted" for the Low Level Machine and presented in Table 3.

Second, the hardware and firmware impact of implementing the
instruction set of Table 2 on the current AYK-15 computer is analyzed in
Section 3. The cost impact of the proposed changes is summarized in Table 7.


Finally, the instruction set of Table 3 is used to investigate the design
of a low-level member of the AYK-15 based computer family. Whenever
appropriate, performance is sacrificed to achieve a minimum parts count
for the Low-Level Machine (LLM). During this investigation, floating-point
instructions are also incorporated into the LLM design. The results of the
design are tabulated and presented in terms of performance (instruction
speeds), parts, and power.

This study shows the desirability and practicality of generating a
family of military computers based upon the present AYK-15. With the
modifications outlined in this report, the AYK-15 and the LLM provide a
sound basis for developing a family of airborne digital computers.

SECTION I
PURPOSE 


This document is a final report summarizing all facts found and conclusions
drawn in the course of fulfilling DAIS Study 33615-76-C-1292, often
referred to as the "DAIS Study." The purpose of the contract has been to
establish a modified instruction set for the present DAIS computer (AYK-15)
and to select a subset of this instruction set to implement a
lower-performance, upward-compatible computer. This report serves as a basis
for the definition of an upward-compatible computer family for the Air Force.

A preliminary  hardware  design  of  the  lower  performance  computer  was 
then  performed  and  is  included  in  this  report. 

Finally,  the  impact  of  modifying  the  present  DAIS  computer  (AYK-15)  to 
implement  the  instruction  set  modifications  was  investigated. 

1.1 INSTRUCTION SET CHOICE

At the outset, a preliminary instruction set was chosen by the AFAL for
Westinghouse's use as a baseline in its analysis to determine an optimal
instruction set, from a hardware/firmware viewpoint as well as a
programmer's, for the proposed computer family. Paramount in the choice of
this instruction set (Appendix A of the original contract SOW) was the need
to conserve the actual memory space required to encode operational avionics
programs. It was recognized that the best way to implement this saving was
to create single-length memory reference instructions (16 bits long) which
could generate a 16-bit effective memory address (to reference up to 65K
words).

Several new addressing modes were proposed as methods of synthesizing
16-bit memory reference instructions:

a. Register Indirect Addressing

b. Register Indirect With Auto Increment

c.  Base  Relative  Addressing 

d.  Instruction  Counter  Relative  Addressing 

e.  Immediate  Short  Formats 

f.  Immediate  Long  Formats 

Of these new addressing modes, the most significant in terms of software
efficiency (defined by AFAL purely in terms of the total number of
16-bit words required to code programs) were determined to be Register
Indirect and Base Relative addressing. Since both types are capable
of synthesizing 16-bit memory reference instructions, they were posed
as alternatives in the selection of the final instruction set. Their relative
strengths were then explored by coding a sample avionics problem
supplied by the Air Force in each instruction set (i.e., Register Indirect
and Base Addressing).

1.2 HARDWARE MODIFICATIONS TO PRESENT DAIS COMPUTER

After the software analysis of the proposed instruction sets was
completed, the task of implementing the addressing modes within the
framework of the present DAIS computer was studied. This was undertaken
in two ways: first, considering only firmware (microcode) changes to the
present AYK-15 computer with no hardware changes, and second, with
complete freedom to modify or add to hardware as well as firmware.

At this point, the feasibility of the goal of a 30 percent improvement in
software efficiency over the present AYK-15 computer with the new addressing
modes was analyzed with respect to hardware/firmware/cost tradeoffs, and a
final instruction set was chosen.

1.3 DESIGN OF LOW LEVEL DAIS MACHINE (LLM)

Another concern in the choice of the optimal instruction set was the
feasibility of subsetting the final set for the less powerful members of
the computer family. This subsetting also had to maintain an "upwards
compatibility" within the family, meaning all instructions used by the
"low level" machines would be contained in the "higher level" machines.
This ensures that operational software which would run on the low level
machine would also run on any of the higher level machines in the family.

Westinghouse and AFAL then chose one such subset of the final
instruction set for use in the design of a low-level machine (LLM).
Generally, this subset contained all instructions of the final set except
the floating point arithmetic and double-precision multiplies and divides,
keeping the LLM oriented towards a simple, fixed point, front-end
processor. (Subsequently, floating point arithmetic was added to the LLM
during the design phase.)

A preliminary hardware design of the LLM was then performed. Paramount
in this design was the use of the 2900 family of bipolar LSI logic,
which  has  emerged  as  a front-runner  in  the  rapidly-expanding  technology  of 
the  LSI  field.  As  currently  supplied  by  Advanced  Micro  Devices  (AMD), 
Motorola,  and  Raytheon,  this  logic  family  meets  Mil-Spec  performance 
criteria,  provides  low  parts  count  design  with  low  power  consumption,  and 
is  reliably  available  on  the  market.  The  AM-2901  four-bit  microprocessor 
slice  is  also  structurally  compatible  with  the  MM-5701  (used  in  the  AYK-15), 
making  the  LLM  design  directly  applicable  to  the  AYK-15. 

The  primary  difference  in  the  two  instruction  sets  was  the  two 
addressing  modes.  Each  set  contained  a "core"  of  present  DAIS 
instructions  (referred  to  as  "DAIS  Baseline"). 

The AFAL supplied a set of three sample avionics programs (DAIS
Benchmarks 1, 2, and 3) which are detailed in the document specification
number F44615-75-R-1154. Of the three, BENCHMARK No. 1 was chosen by
Westinghouse for coding in the two candidate instruction sets.

BENCHMARK  No.  1 was  divided  into  six  program  segments  as 
follows: 

(1)  Decision  and  Control 

(2)  Arithmetic  Computation  No.  1 and  2 
(3)  Arithmetic  Computation  No.  3

(4)  Arithmetic  Computation  No.  4 and  5 

(5)  LIMIT  Subroutine 

(6)  HMSANG  Subroutine 

This  partitioning  was  made  both  to  facilitate  documentation  and  to  provide 
for  statistical  comparison.  It  isolated  Decision  and  Control,  arithmetic 
processing,  and  certain  subroutines  for  individual  scrutiny. 

The statistical comparison was done in two reference frames. First,
the relative software efficiency (as defined in Paragraph 1.1) of the two
instruction sets from the coding of Benchmark No. 1 was analyzed.

Then  the  frequency  of  usage  of  the  non-DAIS  Baseline  instructions 
(as  defined  earlier)  of  each  instruction  set  in  the  coding  of  the  program 
was  analyzed.  This  highlighted  the  relative  "strengths"  of  the  new 
instructions  in  each  set  by  pointing  out  how  useful  each  was  in  solving 
the  Benchmark  problem. 

SECTION  II


INSTRUCTION  SET  DEFINITION 

2.1 FAMILY CONCEPT (UPWARDS COMPATIBILITY)

If a set of computers, all with varying degrees of processing capability,
is to be considered a computer "family," there must be a direct
interrelation among them. A valid measure of the notion of a computer
family is the "upwards compatibility" of the machines. This can be
determined directly from whether or not a fully operational program
written for a smaller member of the "family" can be run directly on a
larger family member with the same results.

To  this  end,  the  computer  family  must  be  "upwards  compatible"  in 
terms  of  software.  An  instruction  set  for  the  "higher"  level  members 
of  the  family  should  be  conveniently  subsettable  for  the  "lower"  level 
family  members. 
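
The software side of this definition reduces to a subset test on instruction sets. The following is a minimal sketch of that test; the function names and the mnemonic sets in the usage example are illustrative assumptions, not the instruction matrices of Tables 2 and 3.

```python
def is_upward_compatible(lower_set, higher_set):
    """True when every instruction of the lower machine exists on the higher one."""
    return set(lower_set) <= set(higher_set)


def missing_instructions(lower_set, higher_set):
    """Instructions that would break upward compatibility, in sorted order."""
    return sorted(set(lower_set) - set(higher_set))


if __name__ == "__main__":
    # Hypothetical sets for illustration only.
    higher = {"LB", "STB", "AB", "SBB", "MB", "DB", "FLOAT_ADD"}
    lower = {"LB", "STB", "AB", "SBB"}
    print(is_upward_compatible(lower, higher))
    print(missing_instructions(lower | {"XYZ"}, higher))
```

A program restricted to the lower set runs unchanged on the higher machine only if the first check holds; the second function names the offending order types when it does not.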

Furthermore,  a hardware  compatibility  must  be  maintained  within 
the  family.  A fixed  set  of  machine  characteristics  should  be  incorporated 
in  each  family  member,  with  extensions  added  to  this  basic  set  for  the 
higher level machines. This is done to ensure family integrity in data
formats,  interrupt  service,  and  the  like. 

2.2 SOFTWARE EFFICIENCY STUDY OF NEW ADDRESSING MODES

As outlined in Paragraph 1.1 of this report, two candidate instruction
sets (base relative and register indirect) were assembled to compare the
relative strengths of the register indirect and base register addressing
formats. The register indirect instruction set was as defined in Appendix
A of contract F33615-76-A-1292 (DAIS Study). The base addressing
instruction set used was as defined in the Westinghouse-prepared document
entitled DAIS Processor Support Software (specification no. MN255R81S).

a. RESULTS OF SOFTWARE ANALYSIS

With the Benchmark completely coded in both the Register Indirect and
Base Addressing sets, an algorithm was devised to measure the relative
software efficiency of the sets. Using the line numbers associated with
the program listings, a numerical equation for computing the number of
16-bit instruction words needed by each program segment was formulated:

    AN_i = (END - BGN) - CMT
    BN_i = (END - BGN) - CMT

Where:

AN_i = The number of words (instructions plus literals) required to
       code segment i in the AFAL instruction set.

BN_i = As above for the Base Register instruction set.

END  = Line number of last line.

BGN  = Line number of first line less one.

CMT  = Number of comment lines.

and i = 1, 2, ..., 6 corresponding to one of six program segments.

Substituting  into  these  equations  yielded  the  following  results: 


(1) Decision and Control
    AN_1 = (207-19) - 14 = 174
    BN_1 = (177-19) - 14 = 144
    AN_1/BN_1 = 1.22

(2) Arithmetic Computation No. 1 & 2
    AN_2 = (195-3) - 2 = 190
    BN_2 = (140-3) - 2 = 135
    AN_2/BN_2 = 1.41

(3) Arithmetic Computation No. 3
    AN_3 = (97-3) - 5 = 89
    BN_3 = (75-3) - 7 = 65
    AN_3/BN_3 = 1.34

(4) Arithmetic Computation No. 4 & 5
    AN_4 = (172-3) - 3 = 166
    BN_4 = (144-3) - 3 = 138
    AN_4/BN_4 = 1.20

(5) LIMIT Subroutine
    AN_5 = (39-3) - 1 = 35
    BN_5 = (38-3) - 1 = 34
    AN_5/BN_5 = 1.03

(6) HMSANG Subroutine
    AN_6 = (98-3) - 2 = 93
    BN_6 = (79-2) - 2 = 75
    AN_6/BN_6 = 1.24

TOTALS:

    AN = 747, BN = 591
    AN/BN = 1.26

These  results  show  the  Base  Register  set  of  instructions  required  less 
program  memory  than  the  AFAL  set  in  all  six  program  segments.  In 
total,  the  AFAL  set  used  27  percent  more  program  storage  than  did  the 
Base  Register  set. 
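
The word-count equation can be re-run directly from the line-number data. The sketch below uses the (END, BGN, CMT) triples transcribed from the six segment results; the variable and function names are assumptions introduced for illustration. Ratios recomputed this way may differ in the last digit from the printed figures.

```python
# (END, BGN, CMT) for each of the six program segments, as listed in the results.
AFAL = [(207, 19, 14), (195, 3, 2), (97, 3, 5), (172, 3, 3), (39, 3, 1), (98, 3, 2)]
BASE = [(177, 19, 14), (140, 3, 2), (75, 3, 7), (144, 3, 3), (38, 3, 1), (79, 2, 2)]


def words(end, bgn, cmt):
    """Words needed by one segment: listing lines minus comment lines."""
    return (end - bgn) - cmt


an = [words(*segment) for segment in AFAL]
bn = [words(*segment) for segment in BASE]

if __name__ == "__main__":
    for i, (a, b) in enumerate(zip(an, bn), start=1):
        print(f"segment {i}: AN={a:3d}  BN={b:3d}  AN/BN={a / b:.2f}")
    print(f"totals: AN={sum(an)}  BN={sum(bn)}  AN/BN={sum(an) / sum(bn):.2f}")
```

The per-segment word counts and the totals of 747 and 591 words follow directly from the listing line numbers.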

In  fact,  the  AFAL  set  requires  more  storage  than  is  reflected  in  the 
above  figures.  Each  time  a unique  address  is  loaded  into  the  general 
register  used  as  the  "indirect  register"  an  additional  location  is  required. 
The  required  word  holds  the  constant  whose  value  is  equal  to  the  address  in 
question.  For  example,  on  page  51  of  the  program  listing,  three  locations 
would  be  required  to  save  the  values  loaded  into  register  A4  on  lines  59,  63, 
and  74  respectively.  This  is  different  from  the  base  addressing  mode, 
which  can  address  uniquely  within  its  8-bit  displacement  range  (256  words) 
with  the  original  base  loaded  only  at  the  beginning  of  all  references  within 
its  boundaries. 

b.  INSTRUCTION  UTILIZATION  (AFAL) 

Of  the  115  AFAL  instructions  only  17  were  used  in  coding  the  Benchmark 
problem.  A detailed  list  follows: 

        INSTRUCTION     NO. OF TIMES USED

(1)     RDA              1
(2)     IRS              2
(3)     RDS              1
(4)     IRM              2
(5)     RDM              1
(6)     RDD              1
(7)     RST              8
(8)     IRST            17
(9)     DRST             2
(10)    IDST             9
(11)    IRL              7
(12)    IRDL             2
(13)    JRU             12
(14)    JREQ             7
(15)    JRGT             9
(16)    JRLT             9
(17)    RSB              2

The ratio of instruction types used to instruction types available:
17/115 = 0.15.


The ratios of the number of Register Indirect instructions (of all types)
used to the total number of instructions required for each of the six
program segments are:

(1)  44/117 = 0.38

(2)  13/137 = 0.09

(3)  6/55 = 0.11

(4)  16/111 = 0.14

(5)  12/27 = 0.44

(6)  0/67 = 0.0

TOTAL  91/514 = 0.18

c.  INSTRUCTION  UTILIZATION  (BASE  ADDRESSING) 

Of  the  60  Base  Addressing  instructions  only  18  were  used.  They  were 
as  follows: 


        INSTRUCTION     NO. OF TIMES USED

(1)     LB, BR5         55
(2)     STB, BR5        45
(3)     AB, BR5          9
(4)     SBB, BR5         7
(5)     SBB, BR6         1
(6)     MB, BR5         15
(7)     DB, BR5          2
(8)     DLB, BR5        18
(9)     DLB, BR6         1
(10)    DSTB, BR5       20
(11)    DAB, BR5        14
(12)    DSBB, BR5       13
(13)    JCRI, EQ         6
(14)    JCRI, LT        10
(15)    JCRI, GT         8
(16)    JCRD, EQ         1
(17)    JRI              4
(18)    JRD              3

The ratio of instruction types used to instruction types available:
18/60 = 0.30.

The ratios of the number of Base Addressing instructions (of all types)
used to the total number of instructions required for each of the six
program segments are:

(1)  70/118 = 0.60

(2)  53/120 = 0.44

(3)  29/55 = 0.53

(4)  59/116 = 0.51

(5)  9/25 = 0.36

(6)  13/60 = 0.22

TOTAL  233/494 = 0.47

These ratios show the set of base addressing instructions to be more
applicable than the register indirect addressing instructions in a typical
avionics problem (such as Benchmark No. 1), both in having more of its
instructions applicable in the coding (30 percent to 15 percent) and in the
overall frequency of their use (47 percent to 18 percent).

Table  1 summarizes  the  above  figures  from  the  comparison  of  the 
two  instruction  sets. 
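
The utilization figures are simple ratios and can be checked mechanically. The sketch below recomputes the Base Addressing frequency-of-use ratios from the per-segment counts listed above, together with the two "types used / types available" ratios; all names are assumptions introduced for illustration.

```python
# (base addressing instructions used, total instructions) per program segment.
BA_PER_SEGMENT = [(70, 118), (53, 120), (29, 55), (59, 116), (9, 25), (13, 60)]

# (instruction types used, instruction types available) for each candidate set.
TYPES = {"AFAL": (17, 115), "BASE": (18, 60)}


def overall_utilization(pairs):
    """Total special instructions used divided by total instructions coded."""
    used = sum(u for u, _ in pairs)
    total = sum(t for _, t in pairs)
    return used, total, used / total


if __name__ == "__main__":
    for i, (u, t) in enumerate(BA_PER_SEGMENT, start=1):
        print(f"segment {i}: {u}/{t} = {u / t:.2f}")
    used, total, ratio = overall_utilization(BA_PER_SEGMENT)
    print(f"total: {used}/{total} = {ratio:.2f}")
    for name, (u, a) in TYPES.items():
        print(f"{name} types used/available: {u}/{a} = {u / a:.2f}")
```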

d. CONCLUSIONS OF SOFTWARE ANALYSIS

In  terms  of  software  efficiency,  it  is  apparent  register  indirect 
addressing  is  a poorer  choice  for  a short  memory  reference  instruction 
mode  than  base  register  addressing.  We  can  see,  from  our  coding  of 
Benchmark  No.  1,  a significant  savings  in  memory  utilization  with  the 
base  addressing  mode  (27  percent  less  memory  space  than  register 
indirect). 

From the view of utility of instructions, the base register addressing
mode again appears to be a better choice. A larger percentage of available
base addressing instructions was used (30 percent) than register indirects
(15 percent), and these instructions were used with over twice the frequency
(47 percent to 18 percent) of the register indirects in the solution of
Benchmark No. 1. This indicates the base addressing instructions are
"richer" in utility for solving typical avionics problems than register
indirects, despite forming a set roughly half the size (60 to
115). This is also a plus for base addressing, as fewer instruction order
types are necessary for greater utility.

TABLE 1

INSTRUCTION SET COMPARISON

Program Segment            (1)   (2)   (3)   (4)   (5)   (6)   Total Program

MEM Usage*                 22%   41%   34%   20%    3%   24%   27%
N(AFAL)/N(BA)

AFAL Instr Utilization**   38%    9%   11%   14%   44%    0%   18%

BA Instr Utilization**     60%   44%   53%   51%   36%   22%   47%

Notes:  *  Reflects the percentage by which the AFAL program storage
           requirement exceeded the Base Addressing program storage
           requirement.

        ** Reflects the percent of the total instructions which were
           AFAL (or BA) instructions.

In the process of analyzing the proposed addressing mode changes, many
conclusions were reached by the programmers who performed the actual
coding. What follows is a summary of their comments about the proposed
instruction changes.

For purposes of this discussion, the following instruction word field
definitions are used:

OT   - order type code
R_A  - general register R0, ..., R15
R_EA - general register used to designate an address
R_B  - general register used as a base address register
D    - displacement field
N    - binary number
OCX  - operation code extension
EXP  - exponent

2.2.1 Register Indirect


Instruction format:

    +------------+--------+--------+
    |     OT     |  R_A   |  R_EA  |
    +------------+--------+--------+
     16        9  8     5  4     1

Register  Indirect  addressing  is  an  efficient  addressing  mode  when  there  are 
repeated  references  to  the  same  location.  When  combined  with  auto- 
indexing  this  advantage  is  extended  to  enhance  references  to  adjacent 
locations.  As  can  be  imagined,  if  a program's  data  can  be  structured 
sequentially  the  register  indirect  addressing  can  provide  an  increase  in 
software  efficiency  over  double  word  instructions. 

However,  if  the  data  base  cannot  be  structured  in  sequential  nature  (as 
will  typically  be  true  of  all  global  data  blocks),  then  register  indirect 
addressing  will  be  of  very  limited  use.  As  an  example,  consider  the  two 
subroutines  below.  Both  subroutines  are  constrained  to  use  data  from  a 
global  block  of  data  as  is  typical  of  many  data  structures. 

SUBROUTINE A          GLOBAL DATA          SUBROUTINE B

R0 = A/B # C/D        VAR A                R0 = (“) + C + A
                      VAR B
                      VAR C
                      VAR D

Structured vs Non-structured Data

Both subroutines are required to perform operations from left to right
in order to prevent overflow or underflow. As can be quickly appreciated,
Subroutine A is ideally suited for implementation with register indirect
addressing since its parameters are stored in the exact sequential order
they are needed for computation. However, Subroutine B requires a
different ordering of the global variables in order to use register indirect
addressing. Of course, some compromise of the sequence of the four
variables may be arrived at to allow both Subroutine A and Subroutine
B to utilize register indirect addressing of their shared variables.
However, as the number of users of the global variables grows, the task
of organizing the data in an optimum fashion for each subroutine user
becomes truly Herculean.

It  is  primarily  for  this  reason  that  register  indirect  addressing  is 
inadequate  for  the  computer  family.  Additionally,  once  a program  is 
written,  the  order  of  storage  of  the  variables  may  never  be  altered 
without  a major  rewriting  of  the  program  itself.  This  makes  program 
revision doubly difficult and is certainly not in keeping with good
programming practices.

2.2.2 Base Relative


Instruction format:

    +----------+------+------------+
    |    OT    | R_B  |   OFFSET   |
    +----------+------+------------+
     16      11 10   9 8          1


In the process of arriving at the present set of Base Relative
instructions, Westinghouse relied heavily on its experience with the
predecessor of DAIS, the Minicomputer. This machine used a similar form of
base addressing with an eight-bit displacement.

Although not as convenient for coding as double-word instructions, base
addressing has proven effective in reducing the memory required to perform
avionics problems. Inherent in the use of base addressing is a
careful planning of the data structure in order to take advantage of the
limited addressing range. It is for this reason that four base registers
were chosen. In a typical problem R4 would be used to access a list of
global data. Similarly, R5 would be used to access all local variables
while R6 would reference a block of "scratch pad" for computation and
intermediate results. The last base register, R7, would then be free.


The  main  disadvantage  to  base  addressing  is  the  restriction  to  a single 
accumulator.  This  definitely  presents  problems  when  compared  to 
multiple  register  capability.  However,  the  full  set  of  register  to 
register  instructions,  as  well  as  the  double  word  instructions,  are 
available when it is necessary to perform operations on registers other
than R0.

The  base  addressing  instructions  are  not  intended  to  be  used  solely  in 
a particular  application  but  rather  as  a supplement  to  the  normal  AYK-15 
instructions  when  memory  efficiency  is  desired.  To  this  end  they  would 
be  used  or  disregarded  as  the  particular  application  dictates. 
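
The effective-address calculation implied by the base relative format can be sketched as follows. This is an illustration under stated assumptions, not the AYK-15 microcode: the field boundaries follow the format diagram (order type in bits 16-11, a two-bit base register field in bits 10-9, an eight-bit offset in bits 8-1), the mapping of field values 0-3 onto R4-R7 is assumed, and the offset is treated as unsigned (0 to 255), consistent with the 256-word displacement range mentioned earlier.

```python
BASE_REGISTERS = (4, 5, 6, 7)  # assumption: field values 0..3 select R4..R7


def decode_base_relative(word):
    """Split a 16-bit instruction word into (order type, base field, offset)."""
    order_type = (word >> 10) & 0x3F   # document bits 16-11
    base_field = (word >> 8) & 0x03    # document bits 10-9
    offset = word & 0xFF               # document bits 8-1
    return order_type, base_field, offset


def effective_address(word, registers):
    """Effective address = contents of the selected base register + offset."""
    _, base_field, offset = decode_base_relative(word)
    base = registers[BASE_REGISTERS[base_field]]
    return (base + offset) & 0xFFFF    # wrap to a 16-bit address


if __name__ == "__main__":
    regs = [0] * 16
    regs[5] = 0x2000                   # R5 points at the local-variable block
    word = (0x01 << 10) | (1 << 8) | 0x10
    print(hex(effective_address(word, regs)))
```

With the base loaded once, every word within 256 locations of it is reachable by a single-length instruction, which is the source of the memory savings discussed above.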

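2.2.3 Immediate Short


Type 1: Instruction format:

    +--------+--------+----------------+
    |   OT   |  R_A   |  S  D6 ... D0  |
    +--------+--------+----------------+
     16    13 12     9 8              1

(S, D6 - D0 is a signed seven-bit integer)


Type 2: Instruction format:

    +------------+--------+------------+
    |     OT     |  R_A   | D3 ... D0  |
    +------------+--------+------------+
     16        9  8     5  4         1

(D3 - D0 is an unsigned, four-bit integer whose
sign is determined by a bit in the Order Type code field)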

Type 1's format for the immediate short would require 48 order type
codes to implement only three types of instructions (Load, Add, and
Compare). Since this comprises close to 20 percent of the total number of
order types available, their usage would have to be extremely high to
justify their inclusion. None of these instructions was appropriate for
use in the software analysis performed. This high number of order types
is too much to pay for three instructions which could not be used in the
programs coded.

Type 2's format requires fewer order type codes (six for the three
instructions mentioned above), but again has a similar lack of utility.
The value of an immediate short instruction comes into focus when a
large number of calculations are done with small integer constants, such
as one, two, and the like. This was not the case in Benchmark No. 1.
Further, since short instruction types are the primary goal, the load
and add immediate short instructions may be performed with the more
general base addressing instructions. (This would require the allocation
of a literal in a global data block.)
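
The order type cost quoted for the two formats can be reproduced with a short calculation. The sketch assumes an eight-bit order type space (256 codes) and reads the Type 1 format as fixing only the upper four order type bits, so that each Type 1 instruction occupies 16 of the eight-bit codes; it also assumes each Type 2 instruction needs two codes, one per operand sign. These readings are inferences from the format descriptions, and the names are illustrative.

```python
ORDER_TYPE_BITS = 8
TOTAL_ORDER_TYPES = 2 ** ORDER_TYPE_BITS      # 256 codes in an 8-bit field


def type1_order_types(instructions, short_ot_bits=4):
    """Codes consumed when only the top `short_ot_bits` bits identify the instruction."""
    return instructions * 2 ** (ORDER_TYPE_BITS - short_ot_bits)


def type2_order_types(instructions):
    """Codes consumed when one order type bit selects the operand sign."""
    return instructions * 2


if __name__ == "__main__":
    n = 3  # Load, Add, Compare
    t1 = type1_order_types(n)
    t2 = type2_order_types(n)
    print(t1, f"{t1 / TOTAL_ORDER_TYPES:.1%}")
    print(t2, f"{t2 / TOTAL_ORDER_TYPES:.1%}")
```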

2.2.4 Jump Conditional (IC Relative)


Instruction format:

    +------------+----------------+
    |     OT     |  S  D6 ... D0  |
    +------------+----------------+
     16        9  8              1

    IC = IC + (S D6 ... D0)

This  addressing  mode,  whose  signed  displacement  allows  conditional 
jumping  within  127  locations  of  the  present  IC  value,  is  definitely 
advantageous  in  increasing  software  efficiency.  In  solving  the  Benchmark 
problem  it  was  applicable  for  use  in  approximately  10  percent  of  the  entire 
program.  It  is  an  ideal  short  format  for  program  loops  and  small  distance 
jumps. 
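
A minimal sketch of the IC-relative target calculation follows. Two points are assumptions, since the text states only that the displacement is signed: the eight-bit field is decoded here as two's complement, and the displacement is added to the IC value of the jump instruction itself.

```python
def sign_extend_8(field):
    """Interpret the low 8 bits of `field` as a two's-complement integer."""
    field &= 0xFF
    return field - 0x100 if field & 0x80 else field


def jump_target(ic, word):
    """New instruction counter: IC plus the signed displacement in bits 8-1."""
    return (ic + sign_extend_8(word & 0xFF)) & 0xFFFF


if __name__ == "__main__":
    print(hex(jump_target(0x0100, 0x3405)))   # forward jump of 5 locations
    print(hex(jump_target(0x0100, 0x34FB)))   # backward jump of 5 locations
```

A backward displacement of this kind is what makes the format suit short program loops.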


2.2.5 Jump to Subroutine (IC Relative)


Instruction format:

    +--------+--------+----------------+
    |   OT   |        |  S  D6 ... D0  |
    +--------+--------+----------------+
     16    13 12     9 8              1


It is questionable that subroutines could be located within the range of
this instruction with high frequency. Unlike the targets of the jump
conditional instruction discussed above, most subroutines will not
typically be co-located with their calling points in the main program, as
illustrated by the Benchmark program. This is not a desirable instruction.

2.2.6 Stack (PSH/POP)

We would agree with AFAL in its recommendation for register to
memory stack instructions. Since multiple stacking and unstacking of
registers is desirable in many program applications (subroutines,
argument passing, interrupt save status, etc.), we would suggest the
following formats:


Push instruction:

    +------------+--------+--------+
    |     OT     |   N    |  R_A   |
    +------------+--------+--------+
     16        9  8     5  4     1

    [R_A, ..., R_(A+N)] -> stack


Pop instruction:

    +------------+--------+--------+
    |     OT     |   N    |  R_A   |
    +------------+--------+--------+
     16        9  8     5  4     1

    Top N locations on stack -> [R_A, ..., R_(A+N)]

    (R15 - N - 1) -> R15


It is assumed that R15 is the implied stack pointer. Therefore, the PSH
and POP instructions may be used for handling multiple registers. Of
course, if N = 0 a single register will be transferred.

The use of the stack as an argument-passing instrument is detailed in
Paragraph 2.6, Re-entrant Subroutines, of this report.
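
The multiple-register behavior of PSH and POP can be illustrated with a small simulation. The sketch below is an assumption-laden model, not the proposed microcode: the stack is taken to grow toward higher addresses (consistent with POP reducing R15 by N + 1), registers are stored in ascending order, and the class name, memory model, and stack base address are hypothetical.

```python
class StackMachine:
    """Toy model of 16 general registers with R15 as the implied stack pointer."""

    def __init__(self, stack_base=0x1000):
        self.reg = [0] * 16
        self.mem = {}                      # sparse word-addressed memory
        self.reg[15] = stack_base

    def psh(self, ra, n):
        """Push registers R_A .. R_(A+N), i.e. N + 1 registers, onto the stack."""
        for r in range(ra, ra + n + 1):
            self.mem[self.reg[15]] = self.reg[r]
            self.reg[15] = (self.reg[15] + 1) & 0xFFFF

    def pop(self, ra, n):
        """Restore R_A .. R_(A+N) from the top of the stack; R15 drops by N + 1."""
        self.reg[15] = (self.reg[15] - (n + 1)) & 0xFFFF
        for i, r in enumerate(range(ra, ra + n + 1)):
            self.reg[r] = self.mem[self.reg[15] + i]


if __name__ == "__main__":
    m = StackMachine()
    m.reg[2], m.reg[3], m.reg[4] = 11, 22, 33
    m.psh(2, 2)                            # save R2..R4 on subroutine entry
    m.reg[2] = m.reg[3] = m.reg[4] = 0     # subroutine body clobbers them
    m.pop(2, 2)                            # restore on exit
    print(m.reg[2:5], hex(m.reg[15]))
```

With N = 0 the same two methods transfer a single register, matching the note above.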


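2.2.7 Immediate Long Formats

Instruction format:

    +------------+--------+--------+     +------------------+
    |     OC     |  R_A   |  OCX   |     |        I         |
    +------------+--------+--------+     +------------------+
     16        9  8     5  4     1        (16-bit immediate word)
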


This becomes the format for all immediate long instructions. Each of
the 16 possible instructions is distinguished by its code in the 4-bit
extended op code field OCX. Using the OCX field in this way eliminates any
indexed immediate long instructions.

The advantage of this format comes from the ability to compress all the
AYK-15 immediate addressing instructions into a single order type code with
unique OCX codes. However, the ability to index the operand is sacrificed.

Since immediate addressing is not an important addressing mode (never
used) in Benchmark No. 1, it would appear that changes to the immediate
addressing structure of the AYK-15 have little impact on software efficiency.

2.3 NEW DATA FORMATS


A suitable set of data formats was to be chosen for the computer family,
both for fixed-point and floating-point numbers. Both hardware and
software tradeoffs were made for each format.

2.3.1 Fixed-Point Multiply and Divide

A fixed-point number notation must be considered when fixed-point
multiply and divide instructions are designed and implemented. The choice
for such a notation comes down purely to choosing the position of the binary
point. If the binary point is placed at the left end of the 16-bit number,
between the sign bit and magnitude bits, the machine is called fractional. If
the binary point is placed at the extreme right end of the number, to the
right of the magnitude bits, the machine is considered to be integer:

    Fractional:   S . M M M ... M

    Integer:      S M M M ... M .
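
The two conventions assign different values to the same 16-bit pattern, as the following sketch shows. It assumes two's-complement words; the function names are introduced for illustration.

```python
def to_signed16(word):
    """Interpret a 16-bit pattern as a two's-complement integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def as_integer(word):
    """Integer convention: binary point to the right of the least significant bit."""
    return to_signed16(word)


def as_fraction(word):
    """Fractional convention: binary point between the sign bit and the magnitude bits."""
    return to_signed16(word) / 2 ** 15


if __name__ == "__main__":
    for pattern in (0x4000, 0x0001, 0x8000):
        print(f"{pattern:04X}: integer={as_integer(pattern)}  fraction={as_fraction(pattern)}")
```

The hardware adds and subtracts the two forms identically; only the scaling applied after multiply and divide differs, which is why the choice is described as a convention.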


Since the choice of fractional or integer representation has no
significant impact upon the hardware, the choice is truly one of convention.
This is illustrated by the widespread use of both conventions by the
military computer community:

        MACHINE                MANUFACTURER       NUMBER CONVENTION

(1)     CP-1138 (HARPOON)      Westinghouse       fractional
(2)     AN/AYK-15 (DAIS)       Westinghouse       fractional/integer
(3)     SKC-2000               Singer-Kearfott    fractional
(4)     AP-1                   IBM                fractional
(5)     4-Pi                   IBM                fractional
(6)     AN/UYK-30              Hughes Aircraft    fractional
(7)     AN/UYK-20              Univac             integer

The fractional representation is more common, but again, this is
merely a convention. Perhaps the only area where one notation is
preferable would be when calculating indices into an array of data. Here,
integer representation would be more convenient.

Since  AFAL  has  expressed  a preference  for  integer  notation,  we  would 
propose  that  all  fixed-point  multiplies  and  divides  be  made  to  conform  to 
the  integer  format. 

Also,  we  would  recommend  that  single  precision  multiplies  return  a 
full  32-bit  product.  This  allows  for  retention  of  added  significance  during 
single  precision  computations  and  is  common  practice.  A summary  of  the 
proposed  multiply  and  divide  instructions  follows. 

a.  MULTIPLY

(1)  16-bit MPY (M, MR, MI, MIM)
     - MPY algorithm is integer
     - 32-bit result returned in RA and RA+1 (where RA is even)

(2)  16-bit MPY (MS, MSR, MSI, MSIM)
     - MPY algorithm is integer
     - 16-bit result returned in RA

(3)  32-bit MPY (DM, DMR, DMI)
     - MPY algorithm is integer
     - 32-bit result returned in RA, RA+1 (where RA is even)

b.  DIVIDE

(1)  16-bit Divide (D, DR, DI, DIM)
     - Divide algorithm is integer
     - 32-bit dividend in RA, (RA+1) is divided; quotient is returned in RA and remainder is returned in RA+1 (RA is even)

(2)  16-bit Divide (DV, DVR, DVI, DVIM)
     - Divide algorithm is integer
     - 16-bit dividend in RA is divided; quotient is returned in RA, remainder is returned in RA+1 (RA is even)

(3)  32-bit Divide (DD, DDR, DDI)
     - Divide algorithm is integer
     - 32-bit quotient is returned in RA and RA+1; remainder is not saved
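
The register-pair conventions above can be illustrated with a minimal sketch. It assumes (this is not stated in the report) that the most significant half of a 32-bit quantity sits in RA and the least significant half in RA+1, and that division truncates toward zero. The function names are hypothetical, and overflow and divide-by-zero handling are omitted.

```python
MASK16 = 0xFFFF


def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mpy16(regs, ra, operand):
    """16 x 16 integer multiply; the 32-bit product goes to the pair RA, RA+1."""
    assert ra % 2 == 0, "RA must be even"
    product = to_signed(regs[ra], 16) * to_signed(operand, 16)
    regs[ra] = (product >> 16) & MASK16   # assumed: most significant half in RA
    regs[ra + 1] = product & MASK16       # assumed: least significant half in RA+1


def div32by16(regs, ra, operand):
    """32-bit dividend in RA, RA+1; quotient to RA, remainder to RA+1."""
    assert ra % 2 == 0, "RA must be even"
    dividend = to_signed((regs[ra] << 16) | regs[ra + 1], 32)
    divisor = to_signed(operand, 16)
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):   # assumed: truncate toward zero
        quotient = -quotient
    remainder = dividend - quotient * divisor
    regs[ra] = quotient & MASK16
    regs[ra + 1] = remainder & MASK16
```

Multiplying 300 by 400 in this model leaves the full product 120000 in the register pair, which a following divide by 400 turns back into a quotient of 300 with zero remainder.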

2.3.2 Floating Point Format

The choice of a floating point format presents a different type of problem than the fixed-point choice. A floating-point format definitely impacts the amount of hardware necessary for floating-point calculations. Its choice can also affect its utility and readability to the programmer.

Westinghouse, in its present AYK-15 configuration, has used the following 32-bit format for its single-precision floating-point word:




    |<----------- 24 bits ----------->|<---- 8 bits ---->|
    | S |         MANTISSA            | S |     EXP      |

    First S:   mantissa sign
    Second S:  exponent sign
    Mantissa binary point placement: immediately after the mantissa sign

Each bit of the 24-bit mantissa (fractional notation) is as follows:

    ((Sign) 2^-1 2^-2 ... 2^-23)

The exponent (8 bits) is in a two's-complement notation, with the following format:

    ((Sign) 2^6 2^5 ... 2^0)


On a sliding scale, from hexadecimal 00 to FF, the exponent would appear as follows:

    Hex     Exponent
    7F      2^127
    00      2^0
    FF      2^-1
    80      2^-128


The AFAL has suggested a slightly different format for a 32-bit floating point number:

      32    31 30 ....... 24    23 ................. 1
    |  S  |       EXP        |       MANTISSA        |

    Bit 32:  mantissa sign
    Bit 31:  exponent sign
    Binary point placement: immediately to the left of the mantissa field
The mantissa, while separate from its sign bit, has the same 24-bit meaning as in the Westinghouse format. The AFAL has suggested, however, that the 8-bit exponent be considered as an excess-128 number, meaning the actual exponent value is "offset" by positive 128. On a hexadecimal sliding scale this looks like:

    Hex     Exponent
    FF      2^127
    81      2^1
    80      2^0
    7F      2^-1
    00      2^-128


The two notations give both the same mantissa significance and exponent range (-128 ≤ EXP ≤ 127). However, their individual placement in the 32-bit word field turns into a non-trivial difference.

From an aesthetic viewpoint, both formats have pluses. The Westinghouse notation may be slightly more readable, being in the familiar scientific notation order, (sign) Mantissa x 2^((sign) EXP). The AFAL notation, on the other hand, has a floating point zero (0 x 2^-128) equivalent in hexadecimal of all zeroes (00000000 hex), where the Westinghouse format is hex 80 (00000080 hex).

The individual programmer can also find merits to either convention.

In the AFAL format, a relative measure of the sizes of two floating point numbers can be obtained by comparing their integer values, as the major size indicator (exponent) is in the most significant bits of the word and is on a graduated, smallest-to-largest linear scale. This does not "drop out" directly from the Westinghouse format.

The Westinghouse format has the programmer's advantage of being directly accessible to exponent scaling via the machine's byte-mode instructions, as the exponent falls on an eight-bit boundary. The programmer can do a load byte from memory, add, and store byte to accomplish this directly.
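
A minimal sketch of the two word layouts may help make these programmer-level differences concrete. The field positions follow the format descriptions above as read from the figures; the function names are hypothetical, the mantissa is passed as a raw 24-bit pattern, and only the packing is modeled, not any arithmetic.

```python
def pack_westinghouse(mantissa, exponent):
    """24-bit two's-complement fraction on top, 8-bit two's-complement
    exponent in the low byte."""
    return ((mantissa & 0xFFFFFF) << 8) | (exponent & 0xFF)


def pack_afal(mantissa, exponent):
    """Mantissa sign in the top bit, excess-128 exponent in the next
    eight bits, remaining 23 mantissa bits below."""
    sign = (mantissa >> 23) & 1
    return (sign << 31) | ((exponent + 128) << 23) | (mantissa & 0x7FFFFF)


def scale_exponent_westinghouse(word, delta):
    """Byte-style scaling: only the low byte is loaded, added to, stored."""
    return (word & 0xFFFFFF00) | ((word + delta) & 0xFF)


# 0.5 x 2^3 = 4.0 and 0.75 x 2^1 = 1.5, as (mantissa pattern, exponent)
four = (0x400000, 3)
one_and_half = (0x600000, 1)

# AFAL words order the same way as the values they represent ...
print(pack_afal(*four) > pack_afal(*one_and_half))
# ... Westinghouse words do not, because the mantissa is most significant.
print(pack_westinghouse(*four) > pack_westinghouse(*one_and_half))
```

Under these assumptions a zero mantissa with exponent -128 packs to 00000000 hex in the AFAL layout and to 00000080 hex in the Westinghouse layout, matching the values quoted above.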

These differences pale, however, when compared to the differences in the hardware implemented for floating-point arithmetic. The Westinghouse format makes it simple to "strip" the exponent from the mantissa for processing, and since the exponent is in two's complement notation, a simple addition or subtraction provides the proper new exponent in multiplication or division directly. Exponent overflow or underflow also falls out directly with no new or extra hardware, because of the four-bit slice structure of the 2901.

The mantissa is also conveniently handled once the exponent is stripped away. The eight bits in the exponent can be directly zeroed out without altering the mantissa value, as they are located in the least significant portion of the 32-bit word. Mantissa overflow in addition or subtraction is also obtainable with no extra hardware.

Floating-point arithmetic becomes much more difficult with the AFAL number representation. The exponent does not fall on an eight-bit boundary, making normal operations on it (adding or subtracting for multiply and divide, or direct number scaling) somewhat more difficult. Also, special hardware must be added to detect exponent overflow or underflow. More hardware and/or firmware is necessary to strip this exponent away for computation.

Mantissa handling is also more difficult. The eight exponent bits can no longer be simply zeroed out, as they are located in the most significant portion of the fraction. Instead, the sign bit must be tested and propagated through these eight bits. This requires yet more special hardware. And still more extra hardware is necessary for mantissa overflow/underflow detection.
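
The extra steps can be seen in a small software sketch of "stripping" each format. This only illustrates the bit manipulation involved, under the field positions assumed from the format figures; it does not model the EAU hardware, and all function names are hypothetical.

```python
def split_westinghouse(word):
    """Exponent is the low byte; zeroing it leaves the mantissa in place."""
    mantissa = word & 0xFFFFFF00
    exponent = word & 0xFF
    if exponent & 0x80:              # two's-complement exponent
        exponent -= 256
    return mantissa, exponent


def split_afal(word):
    """Exponent must be shifted out and un-biased; the mantissa sign must
    be tested and propagated through the vacated exponent bits."""
    exponent = ((word >> 23) & 0xFF) - 128
    mantissa = word & 0x007FFFFF
    if word & 0x80000000:
        mantissa |= 0xFF800000       # sign propagated through the upper bits
    return mantissa, exponent


def product_exponent_twos_complement(e1, e2):
    """A plain addition gives the new exponent."""
    return e1 + e2


def product_exponent_excess128(b1, b2):
    """Adding two biased exponents doubles the bias, so one must be removed."""
    return b1 + b2 - 128
```

For example, biased exponents 131 and 129 (true values 3 and 1) combine to 132, the biased form of 4, only after the correction term is applied.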
The amount of extra hardware necessary for floating-point computations (approximately 15% of the parts count) with the AFAL representation outweighs any advantages it might have from an aesthetic or programmer's view. We recommend the use of the Westinghouse representation on this basis.

2.3.3 Extended Floating-Point Arithmetic

Two  extended  floating-point  formats  were  also  studied.  The  first  was  a 
three-word  format,  with  an  eight-bit  exponent  and  40  bits  of  mantissa, 
compared  to  24  for  the  single-precision  format.  The  second  was  a four- 
word  format,  with  56  bits  of  mantissa. 

At approximately three and one-half binary digits per decimal digit of accuracy, roughly seven decimal places are obtainable from the single-precision format, 12 from the three-word extended notation, and 17 from the four-word format.
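
As a quick check on these figures, the number of decimal digits carried by a binary mantissa is the bit count times log10(2), that is, about 3.32 bits per decimal digit, slightly fewer than the rounded three and one-half quoted above. The helper name below is illustrative only.

```python
import math


def decimal_digits(mantissa_bits):
    """Decimal digits of precision carried by a binary mantissa."""
    return mantissa_bits * math.log10(2)


for bits in (24, 40, 56):
    print(bits, "bits ->", round(decimal_digits(bits), 1), "decimal digits")
```

Rounded to whole digits, this gives 7, 12, and 17 for the 24-, 40-, and 56-bit mantissas.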

While the extended floating-point formats do afford an increase in accuracy, several points are worth noting:

a.  When  making  calculations  on  extended  floating-point  numbers, 
the  number  of  internal  registers  necessary  becomes  rather  large.  A 
multiply  instruction  with  a 48-bit  number  requires  six  registers;  for  64 
bits,  eight  registers  are  necessary.  This  can  severely  limit  the  usage  of 
other  available  registers  for  other  variables. 

b.  As the width of the extended format increases, the amount of extra hardware necessary in the EAU (Extended Arithmetic Unit) increases drastically. In jumping from a 24-bit mantissa to a 40-bit length, an extra eight bits must be added to the EAU, which is of 32-bit width. This is the equivalent of 10 to 12 16-pin DIP packs. And to go to 56-bit mantissas from 40 bits, another 16 bits on top of the eight already mentioned are necessary. At 10 to 12 16-pin packs per eight bits, it would cost 30 to 36 16-pin pack equivalents over the present 32-bit EAU to process the 64-bit format rather than the 32-bit single precision notation.

c.  The added hardware in the EAU would also slow down calculations in the single-precision format. Since the "extended" EAU would "use" all of its hardware even in single-precision mode, several clock times may be wasted in clearing out or sign-extending the upper parts of the registers not used in single precision.

In the light of the above mentioned complications, and realizing that the single-precision format is accurate enough for many applications, we do not recommend implementing the extended floating-point formats.

2.4  CONTEXT  SWITCHING 

Context (or Mode) switching refers to a major change in the processing "state" assigned to the computer, as would often be encountered at software breakpoints.

The  complete  "state"  of  the  computer  is  defined  by: 

a.  The  current  value  of  the  IC. 

b.  The  Interrupt  Mask. 

c.  The  Arithmetic  Flags  (Overflow,  Negative,  Zero) 

Context switching is accomplished by an orderly replacement of these three quantities by a new set corresponding to the "new state" of the computer. Referring to these three quantities as Program Status Words (PSW's), context switching is performed by "loading the PSW's."

Similarly, interrupts may be handled in the same fashion by simply loading in new PSW's to define an interrupt service routine.

2.4.1 LPSW Instruction

A new instruction (LPSW) would be added to load the three PSW words (IC, Arithmetic Flags, Interrupt Mask) from successive memory locations pointed to by the effective address. The instruction would be 32 bits long and of the format below.


Execution  of  this  instruction  will  then  accomplish  context  switching. 
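
The effect of LPSW can be sketched as follows. This is only an illustration: the word order in memory (IC, then flags, then mask) is assumed from the order in which the text lists them, and the `Cpu` class and `lpsw` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    ic: int = 0      # instruction counter
    flags: int = 0   # arithmetic flags
    mask: int = 0    # interrupt mask


def lpsw(cpu, memory, effective_address):
    """Load the three PSW words from successive memory locations."""
    cpu.ic = memory[effective_address]          # assumed order: IC first
    cpu.flags = memory[effective_address + 1]   # then arithmetic flags
    cpu.mask = memory[effective_address + 2]    # then interrupt mask
```

After the load, execution continues at the new IC with the new flags and mask in effect, which is the whole of the context switch.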
2.4.2 Interrupts


In keeping with the concept of context switching, the hardware interrupt sequence would be altered. The present DAIS machine uses two fixed memory locations to vector each of the 16 possible levels of interrupts. The first memory location would be re-defined as the address of where to store the current PSW's. The second memory location would be similarly re-defined to be the address of the new PSW's to be loaded into the computer. As is customary, this would be accomplished under hardware control.


In  schematic  form,  an  interrupt  would  be  handled  as  follows: 


[Figure: VECTOR TABLE and LINKAGE, showing the computer state at time of interrupt]


Of course, a return from interrupt would be accomplished by executing the LPSW instruction using the value (LPTR) for its address field.
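
The proposed interrupt sequence and its return can be sketched as below. The layout of the vector table (two consecutive words per level starting at `vector_base`) and the order of the three PSW words in memory are assumptions made for illustration; the report does not fix them in this passage.

```python
def save_psw(state, memory, address):
    """Store the current PSW's in three successive words."""
    memory[address] = state["ic"]
    memory[address + 1] = state["flags"]
    memory[address + 2] = state["mask"]


def load_psw(state, memory, address):
    """Load new PSW's from three successive words."""
    state["ic"] = memory[address]
    state["flags"] = memory[address + 1]
    state["mask"] = memory[address + 2]


def take_interrupt(state, memory, vector_base, level):
    """First vector word: where to store the old PSW's.
    Second vector word: where the new PSW's are found."""
    entry = vector_base + 2 * level
    store_ptr = memory[entry]
    load_ptr = memory[entry + 1]
    save_psw(state, memory, store_ptr)
    load_psw(state, memory, load_ptr)
    return store_ptr    # the linkage pointer used for the return


def return_from_interrupt(state, memory, linkage_ptr):
    """Equivalent to LPSW with the linkage pointer as its address."""
    load_psw(state, memory, linkage_ptr)
```

Because entry and return both reduce to storing or loading the same three words, no separate return-from-interrupt mechanism is needed in this model.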

2.4.3 Privileged Modes

In data processing type environments, some machine instructions may be reserved for execution by "privileged" users only. This is typically desirable where users may be inexperienced, so that the computer's operating system must be protected from them. However, this has not generally been a problem with military computers due to the high level of refinement enjoyed by an operational program prior to its inclusion in an operational environment.

Nevertheless, should a privileged mode of operation be desirable, it may be entered by a control bit within a PSW word.

2.4.4 Multiple Register Sets

The  most  common  scheme  adopted  by  the  industry  is  to  offer  two  sets 
of  registers,  thus  allowing  one  to  be  used  for  processing  interrupts.  This 
obviates  the  necessity  of  storing  a machine  register  upon  interruption. 

Should a second set of working registers be desirable, its selection may be indicated by a bit in a PSW.

2.4.5 Extended Memory Addressing

The  present  DAIS  addressing  capability  extends  to  16  bits,  or  65K  of 
memory.  This  can  be  extended  through  the  PSW  by  the  inclusion  of  a 
block  register  bit  or  bits  in  the  word.  Each  time  the  PSW  is  loaded,  a 
block  register  would  also  be  loaded  with  the  bit  value  in  the  PSW.  This 
register  would  hold  the  block  value  until  a new  PSW  is  loaded,  providing 
upper  bits  for  memory  referencing. 

We  recommend  a one-bit  block  register,  giving  up  to  130K  addressing. 
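
A minimal sketch of how a block register could supply the upper address bits is given below. The function name and the choice to place the block bits directly above the 16-bit address are assumptions for illustration.

```python
def physical_address(block, address16, block_bits=1):
    """Concatenate the block register with a 16-bit address."""
    assert 0 <= block < (1 << block_bits), "block value out of range"
    return (block << 16) | (address16 & 0xFFFF)


# A one-bit block register doubles the addressable locations.
locations = 1 << (16 + 1)
print(locations, hex(physical_address(1, 0xFFFF)))
```

With one block bit the highest address is 1FFFF hex, that is, 131,072 locations, which is the "130K" figure recommended above.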
2.4.6 PSW Formats


The three PSW words would be of the format below:


    PSW1   Interrupt Mask (bits 16 through 1)
           Bit 16: highest level (1 = On, 0 = Off)
           Bit 1:  lowest level (1 = On, 0 = Off)

    PSW2   Status fields, in the order shown in the figure:
           N    Negative Flag
           O    Overflow Flag
           Z    Zero Flag
           RS   Register Set
                Mode (1 = Exec, 0 = User)
                Interrupt (1 = On)
                Block Register
           The remaining bits are marked X.

    PSW3   IC at Time of Interrupt (bits 16 through 1)


2.4.7 Re-Entrant Subroutines

Subroutines are defined to be "Re-entrant" whenever they may be interrupted by a hardware interrupt and subsequently called prior to their completion of the interrupted computation. Therefore, all intermediate results from an interrupted subroutine must be saved and then restored when the interrupted subroutine is allowed to resume.

If  intermediate  results  are  entirely  contained  within  the  register  set 
then  simply  preserving  the  register  set  upon  interruption  is  sufficient  for 
implementing  re-entrant  subroutines.  However,  if  intermediate  values 
are  held  in  scratch  memory,  then  this  memory  must  be  reserved  at  the 
time of interruption (and not returned for use as common scratch). The collection of information necessary to "re-enter" an interrupted subroutine (i.e., the intermediate values, etc.) at the point of interruption is said to be "Interrupt Linkage."

If  a re-entrant  subroutine  is  allowed  multiple  interrupts  then  multiple 
sets  of  interrupt  linkage  must  be  preserved. 

Not  all  subroutines  need  be  re-entrant.  (In  fact,  Westinghouse  software 
does  not  allow  re-entrant  subroutines  due  to  their  aforementioned 
complexity).  However,  a generalized  scheme  for  implementing  re-entrant 
subroutines  on  the  present  AYK-15  machine  will  be  presented.  Also, 
alternatives  to  the  present  implementation  will  be  presented. 

2.4.7.1 Subroutine Argument Passing

By convention, arguments will be pushed onto a STACK prior to calling a subroutine. Therefore, if N arguments are passed to a subroutine, the calling program will first push all N arguments onto the stack. Presumably the arguments will be pushed in the order the subroutine requires their use. Also, the calling program will assign a scratch memory area to the subroutine by passing a starting address to the subroutine as an argument.

At  the  time  of  a subroutine  call,  the  stack  will  be  configured  as  follows: 


[Figure: stack addressed by STACK PTR, holding the arguments from (Argument Used Last) to (Argument Used First)]


2.4.7.2 Subroutine Calls

2.4.7.2.1 Present DAIS - Subroutine calls are performed by a jump subroutine (JS) instruction (refer to DAIS Processor Support Software, p. 124). The return linkage is placed in the register specified by the R1 field of the instruction. As described, this instruction also implements the subroutine return. Therefore, at the beginning of a subroutine, if A2 contains the return linkage, the register set will be as follows:


    A0
    A1
    A2     RETURN ADR
     .
     .
    A15    STACK PTR  -->  MEMORY: TOP OF STACK


If nested subroutines are allowed, then A2 must be saved prior to the next call.


2.4.7.2.2 Proposed Change - Alternately, the return linkage may be placed on a STACK so that returns may be accumulated to accommodate re-entrant code. An instruction to call a subroutine of the format below would be necessary.


    |  JSR   |  X  | RX  |        AF        |
     16     9 8   5 4   1 16               1

    IC --> STK
    (RX + AF) --> IC


It is assumed that one of the general purpose registers would be an implied stack pointer.

The  calling  sequence  for  a subroutine  would  then  be: 


    STK   ARG N
    STK   ARG (N-1)
     .
     .
    STK   ARG 1
    JSR   SRTN
At the time of the call the stack would be:

[Figure: stack addressed by STK PTR, holding ARG #N ... ARG #2, ARG #1 and the IC (return linkage)]

Note that the return linkage is now on the "top of the stack." The subroutine must first "pop the stack" to save the return linkage prior to popping any arguments. Thus it would seem preferable to simply leave the return linkage in a register.

Finally, a RETURN instruction must be added to pop the return linkage into the IC. This, however, can be a short instruction since all addresses are implied. The return instruction would be:

    |  RTRN  |  X  |  X  |
     16     9 8   5 4   1

    (Top of STK) --> IC
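
The stack-based call and return can be sketched as below. The method names mirror the STK, USTK, JSR and RTRN mnemonics used in the text, but the model itself is an assumption: the stack is taken to grow toward higher addresses, with the pointer addressing the next free word, and the starting stack address is arbitrary.

```python
class Machine:
    def __init__(self):
        self.memory = [0] * 256
        self.sp = 128          # assumed: implied stack pointer register
        self.ic = 0            # instruction counter

    def stk(self, value):
        """Push one word onto the stack."""
        self.memory[self.sp] = value
        self.sp += 1

    def ustk(self):
        """Pop one word from the stack."""
        self.sp -= 1
        return self.memory[self.sp]

    def jsr(self, target, return_address):
        """IC --> STK, then jump to the subroutine."""
        self.stk(return_address)
        self.ic = target

    def rtrn(self):
        """(Top of STK) --> IC."""
        self.ic = self.ustk()


m = Machine()
for arg in (30, 20, 10):                 # ARG 3, ARG 2, ARG 1
    m.stk(arg)
m.jsr(target=0x40, return_address=0x12)

link = m.ustk()                          # the linkage must come off first
args = [m.ustk() for _ in range(3)]      # arguments pop in order of use
m.stk(link)                              # restore linkage for RTRN
m.rtrn()
print(args, hex(m.ic))
```

The extra pop and push around the argument accesses is the overhead the text refers to when it suggests leaving the return linkage in a register.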

Now a complete comparison can be made of the two methods of handling return linkage. Consider the two calling and return sequences shown below:


    Present DAIS                          Proposed Change

    CALLING PROGRAM                       CALLING PROGRAM

            JS    A2, SRTN                        JSR   SRTN

    SUBROUTINE                            SUBROUTINE

    SRTN    .                             SRTN    USTK  TEMP      SAVE LINK
            .                                     .
            J     0, A2                           RTRN

    Total words to call and return - 4    Total words to call and return - 5

If  we  compare  the  Subroutine  Overhead  (number  of  words  to  link  and  return 
from  a subroutine)  we  find  that  the  stacking  mechanism  requires  one  more 
word.  Therefore,  the  two  methods  seem  nearly  equivalent  in  terms  of 
software  efficiency. 

2.4.7.3 Hardware Implications

Employing a stacking mechanism for subroutine returns requires addition of the RROM as specified in Section 2.5.2.

The PSH and POP instructions as defined in Paragraph 2.2.6 would require minor hardware modifications to the present AYK-15. Table 4 presents the summary of modifications necessary to the present AYK-15 processor.

Microcode  flowcharts  for  the  PSH,  POP,  and  LPSW  instructions  are 
presented  in  Paragraph  3.  3. 

2.4.7.4 Interrupt Routines

If re-entrant subroutines are to be allowed, then a complete saving of machine status (arithmetic flags, registers, and IC) is necessary upon receipt of a hardware interrupt. Further, if nested subroutines are to be allowed then stacking of interrupt linkages is desirable.

2.4.7.4.1 Interrupt Stacking - Present DAIS - Interrupt linkages may be stacked in the present DAIS machine by use of the STK and SM instructions. Recalling the interrupt structure of DAIS,

[Figure: VECTOR TABLE and LINKAGE WORDS]

after the arithmetic flags and incremented IC are stored in the linkage words, the service routine is begun at address NEWIC.

To provide complete linkage stacking the service routine will be:

    ' BEGINNING OF INTERRUPT SERVICE
    NEWIC   STK    A15, L1        . STACK FLAGS
            STK    A15, L2        . STACK IC
            SM     15, 0, A15     . STACK REGISTERS
            AIM    A15, (17)      . MOVE STK PTR

            BODY OF SERVICE ROUTINE

    ' END OF SERVICE ROUTINE
            SIM    A15, (16₁₀)    . MOVE STK PTR
            LM     15, 0, A15     . RESTORE REG.
            USTK   A15, L2
            USTK   A15, L1
            EXS    L1             . RETURN
    ' END INTERRUPT SERVICE ROUTINE

2.5  CONCLUSIONS

2.5.1 Summary of Proposed Changes

2.5.1.1 Utilizing Only Firmware Changes

As can be seen from Tables 4 and 5, the only modification which can be accommodated on the present DAIS machine with no hardware impact is register indirect addressing. Hence, if this were the only modification made to the present DAIS computer, new microcode could be added to the existing machines (provided some "S-types" were eliminated) to form the nucleus of the computer family.




However, as discussed in section 2, we have been unable to achieve the desired level of software efficiency (30% improvement over present AYK-15) by using only register indirect addressing as an addition to the present DAIS baseline instructions. For this reason we would conclude that firmware changes alone are not sufficient to satisfy the goals of this study.

2.5.1.2 Utilizing Hardware and Firmware Changes

Section 2 illustrated that the desired improvement in software efficiency can be achieved by the addition of base relative addressing. Although base relative addressing requires minimal additional hardware, the benefits to software efficiency are most dramatic (~30% improvement over present AYK-15). Therefore, we would recommend that the hardware changes listed in Paragraph 2.5.1.1 be incorporated into the present DAIS machine.

These changes would require the alteration of MC1 and MC2, to allow
for  the  addition  of  the  RROM  and  S-Gates  as  shown  in  figures  2 through 

-*•  Also,  some  minimal  backpanel  wiring  changes  would  be  necessary 

between MC1 and MC2. Although requiring changes to two printed wiring boards, these changes are, conceptually, of minimal complexity.

Therefore, incorporation of the hardware changes to accommodate base relative addressing is the only acceptable alternative for achieving the desired increase in software efficiency and should be incorporated into the present AYK-15 machine.

In tables 4 through 7, each case is expressed separately. If multiple cases were to be incorporated, the "costs" in the columns labeled microcode required, hardware required, labor, and parts are not necessarily additive. For example, a memory controller card would require new artwork for one change or many changes, and microcode routines would be shared for different changes. If necessary, new microcode storage would be added.
Purpose: To translate the RA, RB fields of the 6-bit Base Relative format to register addresses.

Additional Hardware: New ROM, 2 1/2 16-pin equivalent packs
Changes: MC1, MC2, backpanel

Figure 2. RROM
2.5.2 Final Instruction Set


Table  2 illustrates  the  final  instruction  set  for  the  modified  DAIS 
computer  as  chosen  from  the  findings  of  this  study. 

2.5.3 Subset for Low Level Machine

When choosing an instruction set for the "low-level machine" of the computer family, we would recommend that a subset of the instructions of Paragraph 2.5.2 be chosen. Further, those instructions which require unique hardware to implement should be excluded from this set. This will enable the low-level machine to reach a minimum parts count with the ensuing advantages of low volume, power, and cost.

In  keeping  with  this  goal,  we  would  recommend  the  elimination  of  the 
floating-point  instructions,  as  well  as  the  double  precision  multiplies  and 
divides.  Both  these  instruction  types  require  unique  hardware  due  to  their 
complexity. 

The  elimination  of  these  instructions  would  be  in  keeping  with  the  goal 
of  a low-level  machine  oriented  towards  the  simple,  fixed  point,  front-end 
processor. 

Table 2 illustrates, in instruction matrix form, the subset of instructions
TABLE 2

RECOMMENDED INSTRUCTION MNEMONICS IN MATRIX FORM


SECTION 3

MODIFICATIONS TO PRESENT DAIS


3.1  MICRO-CODE

The hardware and firmware (micro-code) implications of modifying the present DAIS machine to include the new instructions, addressing schemes and floating-point arithmetic formats are presented in tables 4, 5 and 6, respectively.

3.1.1 Instruction Changes

Each instruction option (table 4) and addressing mode (table 5) is evaluated with respect to six parameters.

a.  OT Codes: The number of Order Type Codes required for the instruction or addressing mode.

TABLE 4

NEW INSTRUCTION EVALUATION

    Change     OT Codes   Time (µsec)      CPU µ-P   MC µ-P   Hard Req'd     Physical Changes
    1. PSH     1          (2.8 + 1.4 N)    6         10       PROM, RSAV     MC1, MC2, CPU, Backpanel
    2. POP     1          (3.0 + 1.6 N)    7         9        PROM, RSAV     MC1, MC2, CPU, Backpanel
    3. LPSW    1          3.8              6         15       [illegible]    INT, Backpanel
TABLE 5

DAIS STUDY ADDRESSING MODE EVALUATION
(SHEET 1 OF 2)

[Table body not recoverable from the source copy.]

Circled numbers refer to notes in text, section 3.1.
*  Addressing mode investigated per SOW amendment No. 1
•  Addressing mode per SOW amendment No. 2


TABLE 5

DAIS STUDY ADDRESSING MODE EVALUATION
(SHEET 2 OF 2)

[Table body not recoverable from the source copy.]


thupimoiA  t ) .MS  u\ijchi  n*  mp  uO  tpflitf  *‘»*il  ,,  p>ut|i am  wonts  411,1  40  tiiaitf  mommy  I'ontfiihpr 
«•  Wtlhlt 

It should be noted when evaluating table 5 that some addressing modes have two sets of entries (e.g., Immediate Short). This is because two hardware methods of implementation were evaluated and the results of each presented.

Although the number of memory controller µ-program locations required for Register Indirect is very large (40), this can be reduced considerably by a much more prudent choice of register indirect instructions. Specifically, whenever an "S type" instruction (as specified in DAIS computer documentation) is required to have a register indirect format, two unique memory controller µ-program locations are required (there are 11 "S type" instructions with register indirect formats as specified by the contract SOW).

The execution times for the Base Relative Short instructions given in table 5 are average times. The actual times are:


a.  Single word fetch instructions: 2.6 µsec
    OP codes 00, 01, 02, 03
             08, 09, 0A, 0B
             0C, 0D, 0E, 0F
             10, 11, 12, 13
             14, 15, 16, 17
             2C, 2D, 2E, 2F
             30, 31, 32, 33
             34, 35, 36, 37

b.  Single word store instructions: 2.8 µsec
    OP codes 04, 05, 06, 07

c.  Double word fetch instructions: 2.6 µsec
    OP codes 18, 19, 1A, 1B
             20, 21, 22, 23
             24, 25, 26, 27

d.  Double word store instructions: 3.2 µsec
    OP codes 1C, 1D, 1E, 1F

e.  Jump conditional, relative: 2.0 µsec no branch / 2.2 µsec branch
    OP codes 38, 39, 3A (Increment)
             3C, 3D, 3E (Decrement)

f.  Jump relative: 2.0 µsec
    OP codes 3B (Increment)
             3F (Decrement)

b)  The Immediate Long Instructions format investigated eliminates indexed immediate long instructions. This means the programmer can no longer do

Rb  * 3 . f12  • DM  R?.  *3.  Rb 
as  one  instruction,  but  must  now  do 


AS •*  R2  - l R 
rl2  * 3 .R?  - AIM 

t ikowiso.  Hg  » R5  * 3 * R2  = AIM  R2.  3,  Rb 
but  must  alto  do,  H5  ■ * R2  =•  LR 

R2  * 3 - R->  =»  AIM 

The  instructions  MIM  (JO  x 16-31)  and  SDiM  (16  *4*  16  - 16)  are  not  presently  implemented  and 
the  microcode  nocessary  is  included  in  this  chart. 

© The range of I is 1 ≤ I ≤ 16; therefore the programmer and/or assembler will have to code the following values for I:

    I      BIT VALUES
    1      0000
    2      0001
    .
    .
    15     1110
    16     1111

The CPU hardware will add 1 to I and assign the correct sign as designated by the OP code.
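
The coding rule for the I field can be sketched as follows. The function names are hypothetical, and the `decrement` flag stands in for whatever sign the OP code designates.

```python
def encode_i(i):
    """Assembler side: a value of 1..16 is coded as I - 1 in four bits."""
    assert 1 <= i <= 16, "I must be in the range 1 to 16"
    return i - 1


def decode_i(field, decrement=False):
    """Hardware side: add 1 to the field, then apply the sign
    designated by the OP code."""
    value = (field & 0xF) + 1
    return -value if decrement else value


print(format(encode_i(15), "04b"), decode_i(0b1111))
```

Coding I - 1 rather than I lets four bits cover the full range 1 through 16, since a displacement of zero is never needed.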

(5)  The present DAIS machine architecture contains a 4-bit condition status register with one bit allocated to each of the following conditions:

a.  Less than zero, less than (comparison)

b.  Equal zero, equal (comparison)

c.  Greater than zero, greater than (comparison)

d.  Overflow, underflow, abnormal, etc.

This does not accommodate a jump on carry condition. However, the carry result is available from the carry save flip-flop, and is used during micro-code branch conditions. By specifying a separate op code, new micro-code can be written to generate the desired Jump On Carry instruction. All required hardware exists; only firmware changes are required.

FLOATING-POINT INSTRUCTION


b.  ADD TIME (Table 5 only): A comparison of a single precision add time for the addressing mode versus the comparable time for a double word instruction. This number is expressed as a ratio with the double word instruction time being the denominator.

c.  Time (Table 4 only): The execution time (in µsec) required for the instruction.

d.  CPU µ-P: The number of CPU µ-program words required to implement the addressing mode or instruction.

e.  MC µ-P: The number of Memory Controller µ-program words required to implement the addressing mode or instruction.

f.  Hardware Required: The additional hardware necessary to implement the addressing mode or instruction on the existing DAIS machine.

g.  Physical Changes: The modules in the existing DAIS machine which must be modified to accommodate logic changes in order to implement the addressing mode or instruction.

3.1.2 Changes for Floating-Point Instruction Formats

The changes to the present DAIS computer for the three Floating-Point instruction formats are shown in table 6.

1.  The changes required are the addition of 10 new parts for the exponent arithmetic and reconfiguring the three boards. However, the EAU would still consist of one control board and two data boards.

2.  Reconfigure the EAU functional schematic; still only 1 control board and 2 data boards are needed. Forty new memory controller µ-code locations are needed to handle the extra mantissa word. Thirty-four new parts are added for exponent arithmetic and mantissa arithmetic.

3.  Reconfigure the EAU functional schematic and add hardware to accommodate the additional mantissa length. For this format the EAU will be made up of one control board and three data boards.
3.2  HARDWARE/FIRMWARE COST SUMMARY


The comparative costs associated with the evaluation results shown in tables 4, 5 and 6 are presented in table 7. Material costs are expressed in 1977 dollars for modifying one computer. Non-recurring costs are expressed in labor hours and include:

a.  electrical  and  micro-code  design 

b.  design  verification 

c.  design  documentation 

d.  printed  wiring  board  artwork  changes 

Recurring  costs  are  similarly  expressed  in  labor  hours  and  include: 

a.  assembly  and  test 

b.  matrix  plate  wiring  changes 

c.  system  functional  verification 

d.  system  acceptance  test 

3.3 DETAILED DOCUMENTATION

The 20 new instructions for the DAIS machine are listed in table 8.
This table also details which micro-code routines are required in the CPU,
MC, and EAU. The instruction description, flow charts and timing diagrams
for each of the 20 instructions follow table 8.

TABLE 7
COST SUMMARY

COSTS ASSOCIATED WITH TABLE 4

Item          Parts Cost ($)   Non-Recurring Labor (HR)   Recurring Labor (HR)
[illegible]   4950             1184                       109
[illegible]   4950             [illegible]                109
LPSW          1300             [illegible]                55

COSTS ASSOCIATED WITH TABLE

Addressing modes, in the order printed: Reg Indr; Reg Indr w/Auto Inc;
Immed Short; Immed Short; Jmp Cond IC Rel; Jmp Sub IC Rel; IC Rel Short;
6-Bit Base Rel; Base Rel Short; Immed Long

Parts Cost ($): 665, 665, 1300, 710, 710, 710, 710, 710, 710, 710, 710,
710, 710, 710

Non-Recurring Labor (HR): 205, 50, 430, 800, 750, 800, 750, [illegible],
[illegible], 790, 850, 760, 1000, 870

Recurring Labor (HR): 16, 16, 40, 206, 206, 206, 206, 206, 206, 206, 206,
206, 206, 206

MNEMONIC: PSH
OPCODE: 8B

SHORT NAME: push onto stack
FORMAT: PSH N, RA

DESCRIPTION: The contents of registers Ra through R(a+N) are pushed onto
a stack in memory using R15 as the stack pointer. When completed, R15
is incremented by N + 1.

If N = 0, then only Ra is pushed onto the stack.

Figure 5. PSH Instruction

MNEMONIC: POP
OPCODE: 9B

SHORT NAME: pop from stack
FORMAT: POP N, RA

    1 0 0 1 1 0 1 1 | N | RA

DESCRIPTION: Registers Ra through R(a+N) are loaded sequentially from the
stack in memory using R15 as the stack pointer. When completed, R15 is
loaded with (R15 - N - 1). The CS register is set for each word transferred.
If N = 0, then only Ra will be loaded.

REGISTERS AFFECTED: Ra through R(a+N), R15, CS
TIMING: (3.0 + 1.6N) µsec

Figure 8. POP Timing Diagram
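The PSH and POP descriptions above can be illustrated with a minimal Python sketch. It is not the microcode: it assumes that R15 points at the next free stack word, that PSH stores upward from R15, and that POP reads back from (R15 - N - 1) so that a PSH followed by a POP restores the registers. The names psh, pop and pop_time_usec, the dictionary used as memory, and the 16-bit address wrap are all hypothetical; the order in which POP fills the registers is not stated in the text. CS updates are not modeled.

```python
def psh(regs, mem, n, a):
    """Push R[a]..R[a+n] starting at the address in R15, then R15 += n + 1."""
    sp = regs[15]
    for k in range(n + 1):
        mem[(sp + k) & 0xFFFF] = regs[a + k]
    regs[15] = (sp + n + 1) & 0xFFFF


def pop(regs, mem, n, a):
    """Reload R[a]..R[a+n] from the stack, then load R15 with R15 - n - 1."""
    base = (regs[15] - n - 1) & 0xFFFF
    for k in range(n + 1):
        regs[a + k] = mem[(base + k) & 0xFFFF]
    regs[15] = base


def pop_time_usec(n):
    """POP execution time from the TIMING entry: (3.0 + 1.6N) microseconds."""
    return 3.0 + 1.6 * n


if __name__ == "__main__":
    regs = [0] * 16
    regs[2:5] = [0x11, 0x22, 0x33]
    regs[15] = 0x100
    mem = {}
    psh(regs, mem, 2, 2)           # push R2, R3, R4
    regs[2:5] = [0, 0, 0]          # clobber them
    pop(regs, mem, 2, 2)           # restore R2, R3, R4
    print([hex(r) for r in regs[2:5]], hex(regs[15]))
```

With N = 0 both routines move a single word, matching the "only Ra" case in the descriptions.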


MNEMONIC: LPSW
OP CODE: [illegible]

SHORT NAME: load program status words

FORMAT: LPSW ADDR        nonindexed
        LPSW ADDR, RX    indexed

    [op code bits illegible] | RX | ADDRESS FIELD

DESCRIPTION: The current three program status words are replaced by three
sequential memory words located at the effective address.

This instruction is used for context switching and as a return from
interrupt.

REGISTERS AFFECTED: IC, CS
TIMING: 4.4 µsec

Figure 10. LPSW Instruction

Figure 11. LPSW Timing Diagram


MNEMONIC: FAR
OP CODE: A8

SHORT NAME: floating point add, register-to-register
FORMAT: FAR R1, R2

    1 0 1 0 1 0 0 0 | R1 | R2

DESCRIPTION: The floating point number in registers R2 and R2 + 1 is
added to the content of registers R1 and R1 + 1. The condition status,
CS, is set based on the floating point result in registers R1 and R1 + 1 and
overflow. Overflow is defined as exponent overflow or underflow during the
operation. Upon overflow or underflow a floating point zero, 00000080, is
the result. R1 and R2 must be even.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING: 4.2 µsec

Figure 12. TYPE - PS (Register to Register Special)

Figure 13. FAR Timing Diagram




MNEMONIC: FSR
OP CODE: B8

SHORT NAME: floating subtract, register-to-register
FORMAT: FSR R1, R2

DESCRIPTION: The floating point number in registers R2 and R2 + 1 is
subtracted from the floating point number in registers R1 and R1 + 1.
The difference remains in registers R1 and R1 + 1. The condition status,
CS, is set based on the floating point result in registers R1 and R1 + 1 and
overflow. Overflow is defined as exponent overflow or underflow during
the operation. Upon overflow or underflow a floating point zero, 00000080,
is the result. R1 and R2 must be even.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING: 4.2 µsec

Figure 16. TYPE - PS (Register to Register Special)

Figure 17. FSR Timing Diagram

Figure 18. FSR Instruction

Figure 19. FSR Instruction


MNEMONIC: FMR
OPCODE: C8

SHORT NAME: floating multiply, register-to-register
FORMAT: FMR R1, R2

    1 1 0 0 1 0 0 0 | R1 | R2

DESCRIPTION: The floating point number in registers R2 and R2 + 1 is
multiplied by the floating point number in registers R1 and R1 + 1. The
floating point result is retained in registers R1 and R1 + 1. The condition
status, CS, is set based on the floating point result in registers R1 and
R1 + 1 and overflow. Overflow is defined as exponent overflow or underflow
during the operation. Upon overflow or underflow a floating point zero,
00000080, is the result. R1 and R2 must be even.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING: 5.6 µsec

Figure 20. TYPE - PS (Register to Register Special)

Figure 21. FMR Timing Diagram

Figure 22. FMR Instruction


MNEMONIC: FDR
OP CODE: D8

SHORT NAME: floating divide, register-to-register
FORMAT: FDR R1, R2

    1 1 0 1 1 0 0 0 | R1 | R2

DESCRIPTION: The floating point number in registers R1 and R1 + 1 is divided
by the floating point number in registers R2 and R2 + 1. The floating point
quotient is retained in registers R1 and R1 + 1. The condition status, CS,
is set based on the floating point result in registers R1 and R1 + 1 and
overflow. Overflow is defined as exponent overflow or underflow during the
operation. Upon overflow or underflow a floating point zero, 00000080, is
the result. R1 and R2 must be even.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING: 6.0 µsec

Figure [number illegible]. TYPE - PS (Register to Register Special)

Figure [number illegible]. FDR Instruction

Figure 27. FDR Instruction
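The overflow rule shared by the FAR, FSR, FMR and FDR descriptions above can be sketched as follows. Only the constant 00000080 comes from the text. The exponent range (-128 to 127, as for an 8-bit exponent), the placement of the most significant half in R1, and the function name finish_float_op are assumptions made for illustration; the report's actual floating-point layout is not reproduced here.

```python
FLOAT_ZERO = 0x00000080  # floating point zero returned on overflow or underflow


def finish_float_op(packed_result, exponent, exp_min=-128, exp_max=127):
    """Return (R1, R1+1, overflow) for a finished floating point operation.

    packed_result is the 32-bit result as it would be stored if the exponent
    were in range; exponent is the unbiased result exponent. The range limits
    are assumed, not taken from the report.
    """
    overflow = not (exp_min <= exponent <= exp_max)
    value = FLOAT_ZERO if overflow else packed_result & 0xFFFFFFFF
    r1 = value >> 16            # assumed: most significant half in R1
    r1_plus_1 = value & 0xFFFF  # assumed: least significant half in R1 + 1
    return r1, r1_plus_1, overflow


if __name__ == "__main__":
    print(finish_float_op(0x40000001, 1))    # in range: result passes through
    print(finish_float_op(0x40000001, 200))  # exponent overflow: zero result
```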


MNEMONIC: FCR
OP CODE: F8

SHORT NAME: floating compare, register-to-register
FORMAT: FCR R1, R2

    1 1 1 1 1 0 0 0 | R1 | R2

DESCRIPTION: The floating point number in registers R1 and R1 + 1 is
compared to the floating point number in registers R2 and R2 + 1. If R1 < R2
then the condition status, CS, is set to 1 (less than). If R1 = R2 then CS is
set to 2 (equal to). If R1 > R2 then CS is set to 4 (greater than). No
registers are changed. R1 and R2 must be even.

REGISTERS AFFECTED: CS

Figure 28. TYPE - PS (Register to Register Special)

Figure 29. FCR Timing Diagram

Figure 30. FCR Instruction
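The three FCR outcomes listed above map directly onto a small function. This sketch assumes the two operands have already been decoded into ordinary numbers; the function and constant names are hypothetical.

```python
CS_LESS, CS_EQUAL, CS_GREATER = 1, 2, 4  # CS values given in the FCR description


def fcr_condition_status(r1_value, r2_value):
    """Return the CS value FCR would set for the decoded operands."""
    if r1_value < r2_value:
        return CS_LESS
    if r1_value == r2_value:
        return CS_EQUAL
    return CS_GREATER


if __name__ == "__main__":
    print(fcr_condition_status(1.5, 2.0), fcr_condition_status(2.0, 2.0))
```

The one-hot values (1, 2, 4) let a conditional jump test any combination of outcomes with a single mask.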


MNEMONIC: M
OP CODE: C0

SHORT NAME: single precision multiply

FORMAT: M R1, ADDR        nonindexed
        M R1, ADDR, RX    indexed

    1 1 0 0 0 0 0 0 | R1 | RX | ADDRESS FIELD

DESCRIPTION: The memory operand is multiplied by the content of register
R1. The high order part of the product is retained in register R1; the
lower order part of the product is retained in register R1 + 1.

The condition status, CS, is set based on the result. If RX is 0, then the
16-bit address field is used as a memory address to obtain the memory
operand. If RX is nonzero, then the content of register RX is added to the
16-bit address field and the resulting sum is used as a memory address to
obtain the memory operand.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING: 4.0 µsec

Figure 31. TYPE - D (Direct Memory Access Instruction)

Figure 33. M Instruction (single precision multiply; op codes M = C0,
MI = C2, MB = 10, 11, 12, 13, MIM = C3)

Figure 34. M Instruction (fractional multiply)
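The direct addressing rule and the placement of the product described for M can be sketched in Python. This is an illustration under stated assumptions, not the EAU algorithm: operands are treated as 16-bit two's complement integers (the fractional multiply flow is not modeled), addresses wrap at 16 bits, and the names effective_address and m_instruction are hypothetical. CS is not modeled.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a two's complement integer."""
    return word - 0x10000 if word & 0x8000 else word


def effective_address(addr_field, rx, regs):
    """RX = 0: address field alone; RX nonzero: address field plus R[RX]."""
    if rx == 0:
        return addr_field & 0xFFFF
    return (addr_field + regs[rx]) & 0xFFFF


def m_instruction(regs, mem, r1, addr_field, rx=0):
    """Multiply R[r1] by the memory operand; high part to R1, low part to R1 + 1."""
    operand = mem[effective_address(addr_field, rx, regs)]
    product = (to_signed16(regs[r1]) * to_signed16(operand)) & 0xFFFFFFFF
    regs[r1] = product >> 16
    regs[r1 + 1] = product & 0xFFFF


if __name__ == "__main__":
    regs = [0] * 16
    mem = {0x100: 300}
    regs[2] = 300
    m_instruction(regs, mem, 2, 0x100)   # 300 * 300 = 90000 = 0x00015F90
    print(hex(regs[2]), hex(regs[3]))
```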


MNEMONIC: MR
OPCODE: C1

SHORT NAME: single precision multiply, register-to-register
FORMAT: MR R1, R2

    1 1 0 0 0 0 0 1 | R1 | R2

DESCRIPTION: The content of register R2 is multiplied by the content of
register R1 and the product is retained in registers R1 and R1 + 1. The
condition status, CS, is set based on the result.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING:

Figure [number illegible]. TYPE - R (Register to Register Instruction)

Figure 36. MR Timing Diagram

Figure 37. MR Instruction

MNEMONIC: MI
OPCODE: C2

SHORT NAME: single precision multiply indirect

FORMAT: MI R1, ADDR        nonindexed
        MI R1, ADDR, RX    indexed

    1 1 0 0 0 0 1 0 | R1 | RX | ADDRESS FIELD

DESCRIPTION: The memory operand is multiplied by the content of register
R1. The product is retained in registers R1 and R1 + 1. The condition status,
CS, is set based on the result.

If RX is 0, then the 16-bit address field is used to fetch a memory address.
This memory address is used to obtain the memory operand. If RX is
nonzero, then the 16-bit address field is used to fetch an address. The
content of register RX is added to the fetched address and the resulting sum
is used as a memory address to obtain the memory operand.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING: 5.0 µsec

Figure 39. Type - I (Indirect Memory Access Instruction)

Figure 41. MI Instruction
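The indirect addressing rule in the MI description applies the index after the indirection. A minimal sketch, assuming a dictionary for memory and 16-bit address wrap (both hypothetical, as is the function name):

```python
def indirect_effective_address(addr_field, rx, regs, mem):
    """Fetch an address from mem[addr_field]; add R[RX] to it if RX is nonzero."""
    fetched = mem[addr_field & 0xFFFF]       # first memory access: the pointer
    if rx != 0:
        fetched += regs[rx]                  # index is applied after the fetch
    return fetched & 0xFFFF


if __name__ == "__main__":
    regs = [0] * 16
    regs[4] = 2
    mem = {0x200: 0x300}
    print(hex(indirect_effective_address(0x200, 0, regs, mem)))  # nonindexed
    print(hex(indirect_effective_address(0x200, 4, regs, mem)))  # indexed
```

The extra memory fetch is consistent with MI being listed as slower than M (5.0 against 4.0).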


MNEMONIC: D
OPCODE: D0

SHORT NAME: single precision divide

FORMAT: D R1, ADDR        nonindexed
        D R1, ADDR, RX    indexed

    1 1 0 1 0 0 0 0 | R1 | RX | ADDRESS FIELD

DESCRIPTION: The content of registers R1 and R1 + 1 is divided by the memory
operand. The quotient is retained in register R1 and the remainder is
retained in register R1 + 1. Overflow occurs if the magnitude of the number
in storage is equal to or less than the magnitude in register R1.

The condition status, CS, is set based on the result in register R1 and
overflow. If RX is 0, then the 16-bit address field is used as a memory
address to obtain the memory operand. If RX is nonzero, then the content
of register RX is added to the 16-bit address field and the resulting sum is
used as a memory address to obtain the memory operand. R1 must be even.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING: 4.2 µsec

Figure 43. Type - D (Direct Memory Access Instruction)

Figure 44. D Timing Diagram

Figure [number illegible]. D Instruction

Figure 46. D Instruction

MNEMONIC: DR
OPCODE: D1

SHORT NAME: single precision divide, register-to-register
FORMAT: DR R1, R2

    1 1 0 1 0 0 0 1 | R1 | R2

DESCRIPTION: The content of registers R1 and R1 + 1 is divided by the
content of register R2. The quotient is retained in register R1 and the
remainder is retained in register R1 + 1. The condition status, CS,
is set based on the result in register R1 and overflow. R1 must be even.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING: 4.0 µsec

Figure 47. Type - R (Register to Register Instruction)

Figure 48. DR Timing Diagram

Figure 49. DR Instruction

MNEMONIC: DI
OPCODE: D2

SHORT NAME: single precision divide indirect

FORMAT: DI R1, ADDR        nonindexed
        DI R1, ADDR, RX    indexed

    1 1 0 1 0 0 1 0 | R1 | RX | ADDRESS FIELD

DESCRIPTION: The content of registers R1 and R1 + 1 is divided by the memory
operand. The quotient is retained in register R1 and the remainder is
retained in register R1 + 1. The condition status, CS, is set based on the
result in register R1 and overflow. R1 must be even.

If RX is 0, then the 16-bit address field is used to fetch a memory address.
This memory address is used to obtain the memory operand. If RX is
nonzero, then the 16-bit address field is used to fetch an address. The
content of register RX is added to the fetched address and the resulting sum
is used as a memory address to obtain the memory operand.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING: 5.2 µsec

Figure 51. Type - I (Indirect Memory Access Instruction)
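The register usage common to D, DR and DI (32-bit dividend in R1 and R1 + 1, quotient to R1, remainder to R1 + 1) can be sketched as below. Several points are assumptions rather than statements from the report: operands are treated as two's complement integers, the quotient truncates toward zero so the remainder takes the sign of the dividend, registers are left unchanged on overflow, and a quotient that does not fit in 16 bits is also reported as overflow. The first overflow test is the rule quoted in the D description. The function names are hypothetical.

```python
def to_signed(value, bits):
    """Interpret a bits-wide word as a two's complement integer."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def divide(regs, r1, divisor_word):
    """Divide the pair (R1, R1+1) by a 16-bit divisor; return True on overflow."""
    if r1 % 2:
        raise ValueError("R1 must be even")
    dividend = to_signed((regs[r1] << 16) | regs[r1 + 1], 32)
    divisor = to_signed(divisor_word, 16)
    # Rule from the D description: overflow if |divisor| <= |content of R1|.
    # A zero divisor always satisfies this test, so no division by zero occurs.
    if abs(divisor) <= abs(to_signed(regs[r1], 16)):
        return True
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    if not -0x8000 <= quotient <= 0x7FFF:    # assumed additional check
        return True
    remainder = dividend - quotient * divisor
    regs[r1] = quotient & 0xFFFF
    regs[r1 + 1] = remainder & 0xFFFF
    return False


if __name__ == "__main__":
    regs = [0] * 16
    regs[2], regs[3] = 0x0000, 100
    print(divide(regs, 2, 7), regs[2], regs[3])   # quotient 14, remainder 2
```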


MNEMONIC: DABS
OPCODE: AC

SHORT NAME: double precision absolute value, register to register

FORMAT: DABS R1, R2
        DABS R1

    1 0 1 0 1 1 0 0 | R1 | R2

DESCRIPTION: If the sign bit of register R2 is a one, then double precision
negate registers R2, R2 + 1 and place the result in R1 and R1 + 1; otherwise
place R2, R2 + 1 in R1, R1 + 1, respectively. R1 and R2 must be even. R1
may equal R2.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING: 1.6 µsec

Figure 55. Type - R (Register to Register Instruction)

Figure [number illegible]. DABS Timing Diagram

Figure 57. DABS Instruction


MNEMONIC: DNEG
OP CODE: BC

SHORT NAME: negate double precision register

FORMAT: DNEG R1, R2
        DNEG R1

    1 0 1 1 1 1 0 0 | R1 | R2

DESCRIPTION: The content of register R2 and register R2 + 1 is negated.
The result, the negative of the original double precision number, is placed
in R1 and R1 + 1. R2 may be equal to R1. The condition status, CS, is set
based on the double precision result in registers R1 and R1 + 1 and overflow.
R1 and R2 must be even.

REGISTERS AFFECTED: R1, R1 + 1, CS
TIMING: 1.4 µsec

Figure 58. Type - R (Register to Register Instruction)

Figure 59. DNEG Timing Diagram

Figure 60. DNEG Instruction
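The double precision negate and absolute value operations described above can be sketched on a register pair. The sketch assumes two's complement arithmetic with the most significant word in the even register, and reports overflow when the most negative value is negated (it has no positive counterpart in 32 bits). The helper names are hypothetical and CS is not modeled.

```python
def read_pair(regs, r):
    """Form a 32-bit value from R[r] (assumed high word) and R[r+1] (low word)."""
    return (regs[r] << 16) | regs[r + 1]


def write_pair(regs, r, value):
    regs[r] = (value >> 16) & 0xFFFF
    regs[r + 1] = value & 0xFFFF


def dneg(regs, r1, r2):
    """R1, R1+1 <- -(R2, R2+1); returns True if the negation overflows."""
    if r1 % 2 or r2 % 2:
        raise ValueError("R1 and R2 must be even")
    value = read_pair(regs, r2)
    write_pair(regs, r1, (-value) & 0xFFFFFFFF)
    return value == 0x80000000


def dabs(regs, r1, r2):
    """R1, R1+1 <- |R2, R2+1|; negate only when the sign bit of R2 is one."""
    if r1 % 2 or r2 % 2:
        raise ValueError("R1 and R2 must be even")
    if regs[r2] & 0x8000:
        return dneg(regs, r1, r2)
    write_pair(regs, r1, read_pair(regs, r2))
    return False


if __name__ == "__main__":
    regs = [0] * 16
    regs[4], regs[5] = 0xFFFF, 0xFFFB      # -5 as a double precision value
    dabs(regs, 2, 4)
    print(regs[2], regs[3])                # 0 5
```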


MNEMONIC: SRC
OPCODE: 64

SHORT NAME: shift right cyclic
FORMAT: SRC R2, N

DESCRIPTION: The content of register R2 is shifted right cyclically N
positions. The field N-1 being zero represents a shift of 1 position. The
field N-1 being 15 represents a shift of 16 positions. Bits shifted out of
the least significant bit position enter the sign position. No bits are
lost. The condition status, CS, is set based on the result in register R2.
R2 may be any general register. The assembler subtracts 1 from the program's
value of N and places N-1 in the 4-bit field.

Result in Register    Resulting Condition Status
R2                    Bits    Hex    JC Mnemonic
0                     0010    2      EZ
sign bit = 1          0001    1      LZ
otherwise             0100    4      GZ

REGISTERS AFFECTED: R2, CS
TIMING: (1.4 + 0.4 per position) µsec

Figure 61. Type - R (Register to Register Instruction)

Figure 62. SRC Timing Diagram

Figure 63. SRC Instruction
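The SRC description above (N-1 encoding, cyclic shift, condition status table, timing) can be collected into one short sketch. The function names are hypothetical; the CS values and the timing formula are taken from the description, and the time is assumed to be in microseconds as elsewhere in these tables.

```python
def condition_status(word):
    """CS per the table: 2 (EZ) if zero, 1 (LZ) if sign bit set, else 4 (GZ)."""
    if word == 0:
        return 2
    if word & 0x8000:
        return 1
    return 4


def src(regs, r2, n):
    """Rotate R[r2] right by n positions (1..16) and return the resulting CS."""
    if not 1 <= n <= 16:
        raise ValueError("N must be 1..16")
    field = n - 1                  # value the assembler places in the 4-bit field
    positions = field + 1          # hardware shifts field + 1 positions
    word = regs[r2] & 0xFFFF
    k = positions % 16             # a 16-position rotate leaves the word unchanged
    regs[r2] = ((word >> k) | (word << (16 - k))) & 0xFFFF
    return condition_status(regs[r2])


def src_time_usec(n):
    """Timing entry: 1.4 + 0.4 per position."""
    return 1.4 + 0.4 * n


if __name__ == "__main__":
    regs = [0] * 16
    regs[1] = 0x1234
    print(src(regs, 1, 4), hex(regs[1]))   # low nibble wraps to the top
```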


MNEMONIC: DSLL
OPCODE: 65

SHORT NAME: double shift left logical
FORMAT: DSLL R2, N

    0 1 1 0 0 1 0 1 | N-1 | R2

DESCRIPTION: The content of registers R2 and R2 + 1 is shifted left logically
N positions. The field N-1 being zero represents a shift of 1 position. The
field N-1 being 15 represents a shift of 16 positions. Zeros enter the least
significant position of register R2 + 1. Bits shifted out of the sign position
of register R2 + 1 enter the least significant position of register R2. Bits
shifted out of the sign position of register R2 are lost. The condition
status, CS, is set based on the double precision result in registers R2 and
R2 + 1. R2 must be even.

Result in Registers    Resulting Condition Status
R2, R2 + 1             Bits    Hex    JC Mnemonic
both zero              0010    2      EZ
sign bit of R2 = 1     0001    1      LZ
otherwise              0100    4      GZ

REGISTERS AFFECTED: R2, R2 + 1, CS
TIMING: (1.8 + 0.4 per position) µsec

Figure 64. Type - R (Register to Register Instruction)

Figure 65. DSLL Timing Diagram

Figure 66. DSLL Instruction

MNEMONIC: DSRA
OPCODE: 67

SHORT NAME: double shift right arithmetic
FORMAT: DSRA R2, N

    0 1 1 0 0 1 1 1 | N-1 | R2

DESCRIPTION: The content of registers R2 and R2 + 1 is shifted right
arithmetically N positions. The field N-1 being zero represents a shift of 1
position. The field N-1 being 15 represents a shift of 16 positions. The
sign position of register R2 is not shifted. The sign bit is copied into the
next position for each bit shifted. Bits leaving the least significant
position of register R2 enter the sign position of register R2 + 1. Bits
leaving the least significant position of register R2 + 1 are lost. The
condition status, CS, is set based on the double precision result in
registers R2 and R2 + 1. R2 must be even.

Result in Registers    Resulting Condition Status
R2, R2 + 1             Bits    Hex    JC Mnemonic
both zero              0010    2      EZ
sign bit of R2 = 1     0001    1      LZ
otherwise              0100    4      GZ

REGISTERS AFFECTED: R2, R2 + 1, CS
TIMING: (1.8 + 0.4 per position) µsec

Figure 67. Type - R (Register to Register Instruction)

Figure [number illegible]. DSRA Timing Diagram

Figure 69. DSRA Instruction


MNEMONIC: DSRC
OPCODE: 69

SHORT NAME: double shift right cyclic
FORMAT: DSRC R2, N

    0 1 1 0 1 0 0 1 | N-1 | R2

DESCRIPTION: The content of registers R2 and R2 + 1 is shifted right
cyclically N positions. The field N-1 being zero represents a shift of 1
position. The field N-1 being 15 represents a shift of 16 positions. Bits
leaving the least significant position of register R2 + 1 enter the sign
position of register R2. Bits leaving the least significant position of
register R2 enter the sign position of register R2 + 1. No bits are lost.
The condition status, CS, is set based on the double precision result in
registers R2 and R2 + 1. R2 must be even.

Result in Registers    Resulting Condition Status
R2, R2 + 1             Bits    Hex    JC Mnemonic
both zero              0010    2      EZ
sign bit of R2 = 1     0001    1      LZ
otherwise              0100    4      GZ

REGISTERS AFFECTED: R2, R2 + 1, CS
TIMING: (1.8 + 0.4 per position) µsec

Figure 70. Type - R (Register to Register Instruction)

Figure 72. DSRC Instruction
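The three double precision shifts described above (DSLL, DSRA, DSRC) all treat R2 and R2 + 1 as one 32-bit quantity with R2 as the high word, so they can be sketched as operations on a single 32-bit value. The function names are hypothetical; the CS values and the timing formula follow the tables above.

```python
MASK32 = 0xFFFFFFFF


def dsll(value, n):
    """Logical left shift: zeros enter at the bottom, top bits are lost."""
    return (value << n) & MASK32


def dsra(value, n):
    """Arithmetic right shift: the sign bit is replicated, low bits are lost."""
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> n) & MASK32


def dsrc(value, n):
    """Cyclic right shift across both words: no bits are lost."""
    n %= 32
    return ((value >> n) | (value << (32 - n))) & MASK32


def double_shift(regs, r2, n, op):
    """Apply op to the pair (R2, R2+1) for n = 1..16 and return the CS value."""
    if r2 % 2:
        raise ValueError("R2 must be even")
    if not 1 <= n <= 16:
        raise ValueError("N must be 1..16")
    result = op((regs[r2] << 16) | regs[r2 + 1], n)
    regs[r2], regs[r2 + 1] = result >> 16, result & 0xFFFF
    if result == 0:
        return 2                                   # EZ
    return 1 if result & 0x80000000 else 4         # LZ or GZ


def double_shift_time_usec(n):
    """Timing entry shared by the three instructions: 1.8 + 0.4 per position."""
    return 1.8 + 0.4 * n


if __name__ == "__main__":
    regs = [0] * 16
    regs[2], regs[3] = 0x0001, 0x8000
    print(double_shift(regs, 2, 1, dsll), hex(regs[2]), hex(regs[3]))
```

A 16-position DSRC exchanges the two words, which is a quick check that the cyclic path between R2 and R2 + 1 is wired as described.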


SECTION  IV 


LOW-LEVEL  MACHINE  (LLM)  DESIGN 

4.1 SCOPE OF DESIGN

The  results  of  the  software  analysis  performed  during  this  contract 
provided  the  natural  foundation  for  a family  of  airborne  digital  computers. 
With  appropriate  modifications,  the  present  AYK-15  computer  would 
become  the  high  performance  member  of  the  computer  family.  However, 
the  "low  end"  or  lower  performance  members  of  the  family  were  yet  to  be 
defined. Both the Air Force and Westinghouse felt that this "Low
Level" machine should be instruction set compatible with the higher members
of the family, while minimizing cost, power and volume, and still
using the same support software package and facilities.

To  this  end,  an  investigation  and  block  level  design  was  performed  to 
more  fully  define  the  characteristics  of  this  Low-Level  Machine  (LLM). 
This  investigation  resulted  in  a detailed  study  of  machine  architectures 
suitable  for  the  LLM  implementation  as  well  as  an  I/O  interconnect 
definition (I-BUS) amenable to I/O expansion and CPU interconnection
(multiprocessing). The results of this investigation are parts count,
power and execution time estimates for the proposed LLM.

What  follows  is  a summary  of  this  investigation  which  concludes  with  a 
block  level  description  of  the  proposed  LLM. 

4.2 APPLICATION BASE OF LLM

The  first  step  in  the  LLM  investigation  was  to  define  the  type  of  problem 
to  be  solved  by  the  LLM.  Since  the  computer  is  intended  to  be  used  in  a 
multitude  of  applications,  an  application  base  had  to  be  defined  for  the  new 
machine  in  order  to  limit  the  scope  of  the  investigation.  With  the  help  and 
experience of AFAL, it was decided that the LLM should be used primarily
in a multicomputer avionics environment. It would, therefore, perform
pre-processing of sensor data prior to transmission of the data to other
processors within the system. Similarly, the LLM would perform any
post-processing necessary for actuator data. Figure 73 illustrates a
desired application environment for the LLM.

Since  the  sensor/actuator  requirements  may  be  quite  diverse  from  one 
aircraft  to  another,  the  LLM  should  also  provide  an  efficient  means  of 
interconnection  of  groups  of  LLM's  to  modularly  expand  the  data  handling 
capability  of  the  sensor/actuator  system.  Therefore,  as  the  number  of 
sensors  for  the  system  increases,  additional  LLM's  may  be  added  in  a 
modular  building  block  fashion  as  illustrated  in  figure  73. 

Using  this  application  model  as  a starting  point,  past  programs  were 
reviewed  by  AFAL  and  Westinghouse  in  order  to  establish  the  throughput 
required for the LLM. With the throughput defined, a set of design goals
was then established for the LLM.

4.3  DESIGN  GOALS 

A set of five design goals was established to provide guidelines for the
LLM design. They were:

a. Upward software compatibility with DAIS (AYK-15)

b. 2.5 to 5.0 µsec 16-bit fixed-point ADD

c.  Universal  memory  interface 

d.  I-BUS  I/O  design 

e.  Minimize  volume  and  power 

Software compatibility with the modified AYK-15 machine was given
the highest priority as a design goal in order to take advantage of the
software support developed for the AYK-15. However, wherever necessary,
instructions were omitted from the LLM to simplify its structure and
minimize the parts count. As a result, the LLM became "upward compatible"
with the modified AYK-15 computer. (See Paragraph 2.5.3, Subset
for LLM.)

After reviewing the application bases for the LLM, the goal of 2.5
µsec was deemed suitable for a 16-bit fixed-point ADD execution time.
This design goal permitted a sizing of the control portion of the LLM to
provide a starting point for the design effort.

Because the LLM was intended to be used across a wide range of applications,
it was felt that the ability to adapt the LLM to a particular
application  by  varying  the  memory  organization  was  highly  attractive. 
Therefore,  a generalized  memory  interface  to  allow  for  varying  memory 
speeds  and  technologies  (IC  or  Core)  was  included  as  a design  goal. 

In order to provide for modular growth of the I/O and a link for
multiprocessor application structures, the I-BUS approach developed by AFAL
(Final Report, Cont. No. F33615-74-C-1018) was adopted as the standard
I/O interface.

Finally,  in  order  to  reach  a maximum  application  base  it  was  deemed 
desirable  to  minimize  the  volume  of  the  LLM  by  use  of  available  LSI 
technology  wherever  practical.  To  this  end,  speed  and  performance  were 
sacrificed,  within  the  established  design  goals,  to  allow  for  a minimum 
parts  count  (and  hence  volume)  configuration. 

4.4  LLM  ORGANIZATION 
4.4.1  Arithmetic Loop

The DAIS instruction set is organized around a general register machine
utilizing a group of 16 general registers. This, along with the desired
speed goals, dictated the choice of the AM-2901 µ-processor as the building
block of the LLM arithmetic unit. Figure 74 illustrates the resulting
architecture for the LLM.

The  LLM  is  organized  around  a single  16-bit  data  bus  (MDTA)  within 
the  CPU.  Memory,  I/O  and  CPU  data  are  all  transferred  over  this  bus. 

Two groups of 8-bit wide 2901's are used to process data and form the
register file for the LLM. Registers MOR1 and MOR2 are memory
operand registers used as intermediate buffer registers. SCT is a 5-bit
counter register used as a sequence counter for multiple clock
microprogram routines.

4.4.2  Control Structure

The  control  portion  of  the  LLM  is  comprised  of  a 512-word  by  64-bit 
microprogram  store  contained  in  Read  Only  Memories  (ROM's).  A micro- 
program sequencer  (such  as  the  AM-2911)  is  used  to  control  the  sequencing 
of  the  microprogram  instructions  for  CPU  algorithm  execution.  Micro- 
program address  sources  may  be  selected  from  either  a microprogram 
jump  field  (JADD  ROM)  or  from  a set  of  ROM's  to  allow  efficient  micro- 
program branch  capability.  Also,  system  flags  may  be  individually  tested 
by  the  microprogram  sequencer  to  facilitate  conditional  microprogram 
branching. 

Each  microprogram  ROM  output  is  followed  by  a holding  register  to 
allow  microinstruction  fetches  to  be  overlapped  with  microinstruction 
execution. 
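
The next-address selection described above can be sketched in a few lines.
This is only an illustrative model, not the LLM control logic: the selector
names ("next", "jump", "map", "cond"), the MicroWord fields and the flag
names are assumptions introduced here, and only the 512-word store depth
and the JADD jump field come from the text.

```python
from dataclasses import dataclass

STORE_WORDS = 512  # microprogram store depth stated in the text


@dataclass
class MicroWord:
    source: str = "next"  # hypothetical selector: "next", "jump", "map" or "cond"
    jadd: int = 0         # jump address field (JADD)
    flag: str = ""        # system flag tested when source == "cond" (assumed name)


def next_address(upc, word, flags, map_address=0):
    """Return the address of the next microinstruction.

    upc is the current microprogram address, flags is a dict of system
    flags, and map_address stands in for the output of the branch ROM's.
    """
    if word.source == "jump":
        return word.jadd % STORE_WORDS
    if word.source == "map":
        return map_address % STORE_WORDS
    if word.source == "cond" and flags.get(word.flag, False):
        return word.jadd % STORE_WORDS
    # Default: continue with the next sequential microinstruction.
    return (upc + 1) % STORE_WORDS
```

For example, a conditional branch on an assumed "zero" flag goes to the
JADD address when the flag is set and to the next sequential address
otherwise.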

Discrete  registers  are  provided  for  the  formation  of  the  effective  add- 
ress (EAR)  for  memory  address  instructions  and  for  the  instruction 
counter  (IC).  Each  of  these  registers  and  the  MDTA  bus  are  connected  to 
the  I-BUS  Control  Unit  (ICU)  which  provides  the  interface  to  the  I-BUS. 

The  memories  and  I/O  are  then  interfaced  to  the  I-BUS. 

4.4.3  I/O Organization

The I/O and memory system is interfaced with the I-BUS to provide a
standard interface for all I/O elements. Therefore, a standard set of
I/O modules may be developed and an LLM application configured by
simply "plugging in" the appropriate modules. An I/O module may be as
simple as a discrete interface or as complex as a 1553-A processor
(figure 75).

The memory system is interfaced similarly to an I/O device, through
the MIU (Memory Interface Unit). Any memory technology (IC, Core,
CCD, etc.) may be interfaced with the MIU since all memory timing is
performed in a "handshake" fashion.

Figure 75. LLM I/O Organization

The interrupt system is interfaced directly with the I-BUS and provides
sixteen levels of priority interrupts to the CPU.

4.4.4  Machine Operation and Timing

In  order  to  more  fully  understand  the  operation  of  the  LLM,  five 
microprogram  control  routines  will  be  described  in  detail.  The  routines 
are: 

a.  Instruction  Fetch 

b.  Fixed  point  ADD  (Register/Memory) 

c.  SHIFT  Instructions 

d.  Floating  point  ADD 

e.  Multiply  instruction 

A microprogram  flow  chart  is  included  for  each  of  these  instructions 
to  facilitate  the  explanation. 




a.  Instruction Fetch

During the Instruction Fetch cycle, the CPU reads the current 2-word
instruction to be executed and saves it in IR, MOR1 and MOR2. Referring
to figure 76, each step of the microprogram execution for the Instruction
Fetch cycle is indicated as a separate block. Figure 77 provides the
detailed timing for the Instruction Fetch cycle.

The Instruction Fetch begins by passing the IC to the ICU and requesting
a memory read operation from the memory system (IF1 of Figure 76). The
CPU Control then increments the IC and proceeds to Step IF2 to await the
completion of the memory cycle. When the memory data is ready, the CPU
proceeds to IF3 and loads the fetched memory word (most significant
16 bits of the 32-bit instruction word) into IR and MOR2. A new memory
cycle is then initiated to read the second half of the instruction. Once
again, the IC is incremented and the CPU waits for the completion of the
memory cycle. When the memory cycle has ended, the CPU proceeds to
step IF5 and loads register MOR1 with the second half of the instruction.

The instruction fetch cycle is then completed with the instruction
saved in MOR1 and MOR2. The CPU next proceeds to execute the
instruction before returning to the Instruction Fetch cycle. Figure 77
illustrates the detailed timing for this sequence of events.
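
The fetch sequence may be easier to follow as a short sketch. The model
below is an assumption-laden illustration: memory is represented as a
Python mapping from address to 16-bit word, the handshake wait states
are collapsed into a simple lookup, and the step label IF4 is inferred
from the IF1 to IF5 numbering rather than quoted from the text.

```python
def instruction_fetch(memory, ic):
    """Model steps IF1 to IF5; returns (IR, MOR1, MOR2, updated IC)."""
    # IF1: pass IC to the ICU, request a memory read, then increment IC.
    address = ic
    ic = (ic + 1) & 0xFFFF
    # IF2: wait for the memory cycle to complete (handshake not modeled).
    word = memory[address] & 0xFFFF
    # IF3: load the most significant half into IR and MOR2, start the
    # second read and increment IC again.
    ir = mor2 = word
    address = ic
    ic = (ic + 1) & 0xFFFF
    # IF4 (inferred label): wait for the second memory cycle.
    word = memory[address] & 0xFFFF
    # IF5: load MOR1 with the second half of the instruction.
    mor1 = word
    return ir, mor1, mor2, ic
```

With a two-word instruction stored at addresses 0x100 and 0x101, the
sketch leaves the first word in IR and MOR2, the second word in MOR1,
and IC pointing at 0x102.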

b.  Fixed-Point ADD

The Fixed-Point ADD performs a parallel 16-bit two's complement ADD
of an accumulator register (RA) and a memory operand. The sum is placed
in RA and the appropriate arithmetic flags are sampled.

Referring to figures 78 and 79, the CPU begins execution of the ADD
instruction by calling a microprogram subroutine to compute the effective
address of the memory operand. The subroutine returns the calculated
address in the EAR register.

Figure 78. Fixed-Point ADD Flow

During step ADD1, the effective address is passed to the memory and a
memory read is initiated. The CPU waits for the memory to complete its
read cycle in ADD2. When completed, the CPU loads the memory data
into MOR2 at step ADD3.

The CPU now has both operands for the fixed-point ADD and completes
the ADD operation during Step ADD4. During ADD4, MOR2 is enabled
onto the MDTA bus and passed to the 2901 µ-processor. The CPU control
ROM's instruct the microprocessor to perform a 16-bit fixed-point ADD
to RA and return the result to RA. Simultaneously with RA being loaded
with the sum, the three arithmetic flags (Sign, Overflow, Zero) are
updated to reflect the results of the arithmetic operation.

The  CPU  has  now  completed  the  ADD  instruction  and  returns  to  initiate 
the  next  instruction  fetch  cycle. 
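
The arithmetic performed in step ADD4 can be illustrated with a minimal
sketch. It assumes the conventional definitions of the three flags for
16-bit two's complement addition (Sign is bit 15 of the sum, Zero is set
when the sum is zero, and Overflow is set when two operands of like sign
produce a sum of the opposite sign); the report does not spell these out,
and the function name is hypothetical.

```python
def add16(ra, operand):
    """16-bit two's complement ADD; returns (sum, flags)."""
    ra &= 0xFFFF
    operand &= 0xFFFF
    total = (ra + operand) & 0xFFFF
    flags = {
        # Sign: most significant bit of the result.
        "sign": bool(total & 0x8000),
        # Overflow: operands agree in sign but the result does not.
        "overflow": bool(~(ra ^ operand) & (ra ^ total) & 0x8000),
        # Zero: all 16 result bits are zero.
        "zero": total == 0,
    }
    return total, flags
```

Adding 1 to 0x7FFF, for instance, yields 0x8000 with Sign and Overflow
set, while adding 1 to 0xFFFF yields zero with only the Zero flag set.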

c.  SHIFT Instruction

Figures  80  and  81  illustrate  the  execution  of  the  SHIFT  instruction. 
During SH1, register R is repeatedly shifted while SCT (which contains the
shift count) is decremented. The microprogram sequencer continually
tests the value of SCT and causes microprogram control to be passed to
step SH2 when SCT is zero. During SH2, the arithmetic flags are sampled
and finally the next instruction fetch cycle is begun.
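
A minimal sketch of the SH1/SH2 loop follows. The text does not say which
shift variant is being executed, so a logical left shift of a 16-bit
register is assumed here purely for illustration; the flag definitions
and the function name are likewise assumptions. Only the use of the
5-bit SCT counter as the loop control is taken from the description.

```python
def shift_instruction(r, count):
    """Shift a 16-bit register left 'count' places; returns (r, flags)."""
    r &= 0xFFFF
    sct = count & 0x1F  # SCT is a 5-bit counter
    # SH1: shift one place and decrement SCT until SCT reaches zero.
    while sct != 0:
        r = (r << 1) & 0xFFFF
        sct -= 1
    # SH2: sample the arithmetic flags on the shifted result.
    flags = {"sign": bool(r & 0x8000), "zero": r == 0}
    return r, flags
```

Each pass through the loop corresponds to one microprogram step, which is
why the shift time grows linearly with the shift count.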

d.  Floating  Point  ADD 

The floating point instruction performs a 32-bit floating-point ADD
(8-bit exponent and 24-bit fractional mantissa) between the double register
pair (RA, RA+1) and the double-memory word designated as the operand.
The result is returned to (RA, RA+1), replacing one of the original
operands. Both operands are assumed to be normalized floating point numbers
and their sum is normalized prior to placement in (RA, RA+1).

For purposes of discussion let RE represent the exponent portion of the
register operand while ME represents the exponent portion of the memory
operand.

Figure 80. Shift Instruction Flow

Referring to figure 82, the algorithm begins with an effective address
calculation for the memory operand. The double-word memory operand
is then read from memory and the most significant half saved in MOR2
while the least significant half is saved in MOR1. The floating-point
algorithm now begins with microprogram step FPA1.

During FPA1, RE (exponent field of the register operand) is transferred
into EREG of the Exponent Arithmetic Unit (see figure 74). The next
microprogram step performs an "excess 128" subtract in the exponent
arithmetic unit forming (RE - ME). This represents the exponent
difference (ΔEXP) of the two numbers and will be used to indicate which
operand needs to be adjusted (shifted right).

The operand adjustment algorithm begins at FPA3 where the sign of
ΔEXP is tested to determine which operand is to be adjusted. Assuming
that RE ≥ ME, the control proceeds to FPA4.

Figure 81. Shift Timing

Figure 82. Floating-Point ADD




If the exponent differencing did not overflow then the microprogram
proceeds to FPA6 where it tests to see if the memory operand may be
successfully scaled. If the ΔEXP value in EREG is greater than or equal
to 24, then no further calculations need be performed and the register
operand will be the answer. However, if the memory operand can be
successfully scaled, the microprogram proceeds to FPA7 where PLA#1 is
used to sign extend the mantissa through the exponent field of the memory
operand in MOR2. Next, the register pair (MOR2, MOR1) is shifted right
ΔEXP places in a microprogram subroutine. The memory operand is now
appropriately scaled for mantissa addition.

FPA8 loads EREG with the answer exponent (RE) and proceeds to FPA9
where the exponent field of (RA, RA+1) is sign extended in preparation for
the mantissa add operation of step FPA10. After the mantissas are added,
microprogram control is passed to a normalize subroutine where the
answer mantissa is shifted left until it is appropriately normalized. Of
course, with each shift left required for normalization, the answer
exponent in EREG is decremented. Upon completion of the normalization
subroutine, the answer exponent in EREG is assembled into RA and the
instruction is complete.
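
The align, add and normalize sequence can be sketched at the arithmetic
level. The sketch below makes several assumptions that the report does
not confirm: operands are held as (exponent, mantissa) pairs rather than
packed 32-bit words, the mantissa is a signed 24-bit two's complement
fraction scaled by 2**23, a mantissa is taken to be normalized when its
magnitude lies in the upper half of the range, and a one-place right
shift is applied when the mantissa sum overflows. Exponent overflow and
underflow are not modeled, and all names are hypothetical.

```python
BIAS = 128            # "excess 128" exponent
FRAC = 1 << 23        # scale of the 24-bit signed fractional mantissa
MANT_MAX = FRAC - 1
MANT_MIN = -FRAC


def is_normalized(mant):
    # Assumed rule: zero, or 0.5 <= m < 1.0, or -1.0 <= m < -0.5.
    return mant == 0 or mant >= FRAC // 2 or mant < -(FRAC // 2)


def fp_add(reg, mem):
    """Add two (exponent, mantissa) operands; returns the same form."""
    (r_exp, r_man), (m_exp, m_man) = reg, mem
    # Make the first operand the one with the larger exponent, so that
    # the exponent difference is non-negative (the RE >= ME path).
    if r_exp < m_exp:
        (r_exp, r_man), (m_exp, m_man) = (m_exp, m_man), (r_exp, r_man)
    delta = r_exp - m_exp
    # A difference of 24 or more shifts the smaller operand out entirely.
    if delta >= 24:
        return r_exp, r_man
    m_man >>= delta               # arithmetic right shift (scaling)
    total = r_man + m_man         # mantissa add
    exp = r_exp
    if total > MANT_MAX or total < MANT_MIN:
        total >>= 1               # assumed handling of mantissa overflow
        exp += 1
    if total == 0:
        return 0, 0
    while not is_normalized(total):
        total <<= 1               # normalize: shift left, decrement exponent
        exp -= 1
    return exp, total


def to_float(value):
    """Convert an (exponent, mantissa) pair to a Python float."""
    exp, man = value
    return (man / FRAC) * 2.0 ** (exp - BIAS)
```

As a usage example, representing 1.0 as (129, 1 << 22) and 0.5 as
(128, 1 << 22), to_float(fp_add(one, half)) evaluates to 1.5.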

e.  Multiply

The  fixed-point  multiply  is  performed  entirely  within  the  2901  micro- 
processor using  a one  bit  at  a time  repeated  add  algorithm. 

Referring to figure 83, the multiply algorithm begins with an effective
address computation followed by an operand fetch for the multiplicand. The
multiplication "setup" begins with step MPY1 by transferring the multiplier
to the Q register within the 2901 microprocessor. MPY2 loads the constant
15 (decimal) from PLA#1 (see figure 79) into SCT and shifts Q one place
right, entering the least significant multiplier bit into the S flip-flop.
Next, R is cleared during MPY2 to act as the partial sum register for the
multiply.

Figure 83. Multiply Flow


The  repeated  sums  are  performed  during  step  MPY4  using  the  S flip- 
flop  to  control  the  add  operation  within  the  2901.  As  each  sum  is  formed 
the  result  and  multiplier  are  shifted  one  place  right  to  form  the  next  part- 
ial sum.  The  process  now  continues  until  15  partial  sums  are  formed  at 
which  time  control  is  transferred  to  MPY5. 

In accordance with the rules for performing two's complement
multiplication, MPY5 tests the sign of the multiplier to determine if a
correction cycle for the partial sum is necessary. If required, MPY6
performs the required subtraction. MPY7 adjusts the partial sum for
integer representation while MPY8 moves the least significant half of the
product into RA+1 to complete the instruction.
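
The repeated-add scheme with its sign correction can be checked with a
short arithmetic sketch. It is not a register-level model of the 2901:
the hardware shifts the partial sum and multiplier right after each add,
whereas the sketch below weights the multiplicand by shifting it left,
which gives the same integer product. The function names and the
(high, low) return convention are assumptions for illustration.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a two's complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def multiply16(multiplicand, multiplier):
    """Two's complement 16 x 16 multiply; returns (high, low) halves."""
    m = to_signed16(multiplicand)
    q = multiplier & 0xFFFF
    partial = 0
    # Fifteen conditional adds, one per magnitude bit of the multiplier.
    for bit in range(15):
        if (q >> bit) & 1:
            partial += m << bit
    # Correction cycle: a negative multiplier has weight -2**15 in its
    # sign bit, so the multiplicand is subtracted at that weight.
    if q & 0x8000:
        partial -= m << 15
    product = partial & 0xFFFFFFFF
    return product >> 16, product & 0xFFFF
```

For example, multiply16(-2, 3) returns (0xFFFF, 0xFFFA), the 32-bit
two's complement form of -6 split into most and least significant halves.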

4.4.5  Execution Times

Instruction  execution  times  for  the  LLM  are  a function  of  two  criteria. 
First,  the  memory  speed  has  a direct  impact  upon  both  instruction  fetch 
times  and  operand  fetch  times.  Secondly,  the  internal  circuit  delays  of 
the  LLM  dictate  a maximum  frequency  for  the  CPU  clock.  Using  a one 
microsecond  core  memory  for  instructions  and  data  with  a four  megahertz 
system  clock,  the  following  typical  instruction  times  are  achievable: 

LOAD                3.0 µsec
ADD                 3.0 µsec
STORE               3.0 µsec
SHIFT               2.25 + (N-1) 0.25 µsec
MPY                 8.5 µsec
FP ADD (average)    10.5 µsec
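
The SHIFT entry is the only time that depends on an operand, so a small
worked calculation may help. The sketch assumes that N is the shift
count and reads all times in microseconds; the helper names are
hypothetical. The 0.25 microsecond increment per additional shift
equals one period of the four megahertz clock.

```python
CLOCK_PERIOD_USEC = 1.0 / 4.0  # four megahertz system clock

# Typical times from the list above, in microseconds.
TYPICAL_TIMES_USEC = {
    "LOAD": 3.0,
    "ADD": 3.0,
    "STORE": 3.0,
    "MPY": 8.5,
    "FP ADD": 10.5,
}


def shift_time_usec(n):
    """SHIFT execution time for a shift count of n (assumed n >= 1)."""
    if n < 1:
        raise ValueError("shift count must be at least 1")
    return 2.25 + (n - 1) * 0.25


def clock_periods(time_usec):
    """Express an execution time as a number of clock periods."""
    return time_usec / CLOCK_PERIOD_USEC
```

Under these assumptions a one-place shift takes 2.25 microseconds, a
16-place shift takes 6.0 microseconds, and a 3.0 microsecond ADD spans
12 clock periods.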

4.5  PHYSICAL DESCRIPTION


Using the machine organization shown in figure 74, an estimate of
parts was made to "size" the LLM. Once a parts estimate was obtained,
an estimate of power consumption was then made. For purposes of
estimation, the memory parts and power were omitted while the I/O
configuration was assumed to be a 16-level priority interrupt system.

Using presently available parts, table 9 reflects the parts estimates
for the LLM. Accordingly, the LLM could be fabricated from approximately
120 currently available bipolar devices. Using packaging techniques
similar to the present DAIS computer, the LLM would occupy three
printed wiring boards and dissipate approximately 45 watts.


TABLE 9

LLM PARTS AND POWER ESTIMATES

ELEMENT    LSI    MSI    SSI    POWER (WATTS)

CPU         19     32     10         30
ICU          9     16     15         10
I/O          2      4     15          5

TOTAL       30     52     40         45
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